Publication:

A P2P Database Server Based on BitTorrent (2010)

Author(s): Colquhoun J, Watson P

    Abstract: There are many instances when the dissemination of data sets to a wide audience may be beneficial. For example, in the scientific community this would serve as an opportunity to gain verification of results or allow experts to apply different evaluation techniques on the data generated by others. As computational devices are commonly employed in such analysis and evaluation, easing the electronic dissemination of such data is a necessity. One method to achieve data dissemination is via a centralised database accessible over public access computer networks (e.g., the Internet). In such a scenario data is stored on a centralised service which allows the retrieval of data as and when required by clients. However, in a centralised approach it may be possible for client requests to rise beyond the level that may be handled by a service. To prevent this occurring, a scalable solution is required. Commercial solutions are available that may scale to handle large numbers of client requests. Data is stored in a structured manner, allowing clients to form requests as queries to retrieve specified data items. Unfortunately, any organisation hosting such a service must carry the financial cost of the additional hardware and expertise to maintain the scalable service. If data is popular and is required to be freely available (e.g., historical climate change data), this financial cost will be significant. Therefore, any solution that afforded similar scalability benefits while allowing query-based retrieval of data for clients without the financial costs incurred would be desirable to many organisations. Disseminating information in a scalable manner while limiting the financial burden on any one organisation is possible via peer-to-peer file sharing. Peer-to-peer file sharing software allows users to share files with each other without the need for such files to reside centrally. Files reside on the machines of users, which cumulatively make up the peer-to-peer network of nodes. When a user request is made for a file, those nodes that can satisfy the request may respond. The duplication of files across different nodes affords the scalability of data dissemination, as potentially many nodes can service client requests. Cost of maintenance of the peer-to-peer network is, in essence, shared across all users. In this paper we present a software system called Wigan. Wigan adapts the algorithms of the BitTorrent file-sharing protocol to achieve scalable database-style information dissemination. In this approach we are removing the financial burden of maintaining a scalable server. As such, Wigan will scale as, and when, more clients request data (join the network). Our approach differs significantly when compared to existing peer-to-peer file sharing techniques, as we are dealing with data sets that are the result of client queries as opposed to file instances. This solution is challenging, as Wigan must handle requests for data sets that may vary on a per-client basis, compared to file instances, which do not vary from node to node. In this paper we illustrate our approach to these challenges and present results showing that Wigan can, in certain circumstances, outperform traditional Client-Server database systems.

      • Date: January 2010
      • Series Title: School of Computing Science Technical Report Series
      • Institution: School of Computing Science, University of Newcastle upon Tyne
      • Publication type: Report
      • Bibliographic status: Published

        Keywords: P2P Computing, Databases

        Staff

        Professor Paul Watson
        Professor of Computing Science