CSV2RDF: Generating RDF data from CSV file using semantic web technologies
Recently, a large amount of government and public administration data has been stored on the Web in various file formats, mostly in tabular form such as Comma-Separated Values (CSV) or Excel. The CSV format is simple and practical, but it makes it difficult to express relevant metadata such as data provenance, the meaning of data fields, relationships between data fields, and user access approaches and rights. To make CSV data semantically structured, interoperable, accessible, and reusable for various Web applications, the data need to be extracted from the CSV files and converted into the Resource Description Framework (RDF) format, which provides superior data assimilation and query functionality. In this paper, we focus on how Semantic Web technologies are used to convert CSV data into RDF. We present a method and techniques to parse the CSV file; the parsed CSV data are complemented with metadata annotations to generate an annotated tabular data model, which is then converted into RDF triples. Based on the conceptual correspondences between the CSV data model and the RDF data model, we designed a set of algorithms to generate RDF triples from the CSV data. Our prototype tool, CSV2RDF, is used to evaluate the performance of the proposed method on real-world CSV datasets. The implementation and experimental outcomes demonstrate that our proposed method is feasible for generating RDF data from CSV datasets, with satisfactory performance on data sets of any size.
S M Hasan Mahmud, M.A. Hossin, H. Jahan, Sheak Rashed Haider Noori
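
The row-to-triple correspondence that the abstract describes can be illustrated with a minimal sketch. This is not the authors' CSV2RDF tool or their algorithms; it only assumes that each CSV row becomes a subject, each column header becomes a predicate, and each non-empty cell becomes a plain string literal. The base IRI `http://example.org/`, the function names, and the IRI layout are hypothetical, and the metadata annotation step is omitted.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical base IRI for illustration; not taken from the paper.
BASE = "http://example.org/"


def escape_literal(value: str) -> str:
    """Escape a cell value for use inside an N-Triples string literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(text: str, table_name: str = "table") -> list:
    """Map each non-empty CSV cell to one N-Triples line.

    Assumed mapping: row -> subject, column header -> predicate,
    cell value -> plain string literal.
    """
    reader = csv.DictReader(io.StringIO(text))
    triples = []
    for row_number, row in enumerate(reader, start=1):
        subject = f"<{BASE}{table_name}/row/{row_number}>"
        for column, value in row.items():
            # Skip cells without a header, missing cells, and empty cells.
            if column is None or value is None or value == "":
                continue
            # Percent-encode the header so it is safe inside an IRI.
            predicate = f"<{BASE}{table_name}#{quote(column.strip(), safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


if __name__ == "__main__":
    sample = "name,city\nAda,London\nBob,\n"
    for line in csv_to_ntriples(sample, "people"):
        print(line)
```

In this sketch empty cells produce no triple, which is one possible design choice; a fuller pipeline would consult the metadata annotations to decide datatypes, subject IRIs, and how missing values are handled.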