Advice on Technique for LOD-enabled "Digital Library"

Good morning,

I've cross-posted this to both Semantic Web and Libraries because I think there are elements that appeal to both.

I'm going to give a bit of background on the project I'm working on, and hopefully I can get some confirmation as to whether or not this is the "best" way to go about it. This is Drupal 7, of course. So far, I've been moving in the most obvious direction and using the tools that are most readily available.

Back-story:

We're looking to develop a digital library (which is, I know, a somewhat vague term) containing online books, exhibitions, collections of images, collections of other things, and a few online databases. The Linked Open Data (LOD) part of this is easy for most of the content, be it schema.org or other methodologies.

One big part of this digital library is an online database of sorts containing an authoritative index of botanists and their publications. The goal is to have a semantic web identifier for each of these 10,000 botanists and their 37,000+ publications. Here are my questions, along with my tentative answers.

Questions

Q1: Is importing these into Drupal nodes the best way to go? (I think it's the only way to go, as I need node references and I'd like to take advantage of all that D7 has built in.)

Q2: Is the Feeds Import module the best/fastest/most reliable way to get data into D7?

Q3: With regard to RDFa and the RDFx module, how much should I care about content negotiation? Obviously RDFa is handled for me, but if/when a remote computer comes poking at my site and requests application/rdf+xml, is something useful going to happen?
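To make Q3 concrete, here is a rough sketch of how I imagine a remote client would test this, written in Python using only the standard library. It sends an Accept header that prefers the registered RDF/XML media type, application/rdf+xml, and then looks at the Content-Type that comes back. The function names and the example URL are my own placeholders, and I'm not assuming anything about what RDFx actually returns.

```python
from urllib.request import Request, urlopen

RDF_XML = "application/rdf+xml"  # registered media type for RDF/XML


def build_rdf_request(url: str) -> Request:
    """Build a GET request that prefers RDF/XML and accepts HTML as a fallback."""
    return Request(url, headers={"Accept": f"{RDF_XML}, text/html;q=0.5"})


def served_rdf(content_type: str) -> bool:
    """Return True if a Content-Type header value names RDF/XML."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == RDF_XML


def check_negotiation(url: str) -> bool:
    """Fetch the URL and report whether the server answered with RDF/XML."""
    with urlopen(build_rdf_request(url), timeout=10) as response:
        return served_rdf(response.headers.get("Content-Type", ""))


if __name__ == "__main__":
    # Placeholder URL; substitute a real node path on the site under test.
    print(check_negotiation("http://example.org/node/1"))
```

If the check comes back False, the server ignored the Accept header and sent HTML (with embedded RDFa) anyway, which is exactly the case I'm unsure whether to worry about.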

Q4: How much should I expect performance to degrade with an extra 47k nodes in my system (if we're talking triples, then it's roughly 500k), and what will happen when we load another database that will increase the amount of data by a factor of 10? I assume I'll need more server resources, but can Drupal gracefully handle this amount of content? Do I have to worry about ARC or SPARQL performance?
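For anyone who wants the back-of-the-envelope numbers behind Q4, here is the arithmetic as a small Python sketch. The inputs are the rounded figures from this post; I'm assuming "a factor of 10" applies to the total node and triple counts, and that the triples-per-node ratio stays about the same for the second database, which may not hold.

```python
def estimate_scale(botanists: int, publications: int,
                   triples_now: int, growth_factor: int) -> dict:
    """Rough capacity estimate from the rounded figures quoted in the post."""
    nodes_now = botanists + publications
    return {
        "nodes_now": nodes_now,
        "triples_per_node": triples_now / nodes_now,
        # Assumption: the growth factor scales nodes and triples alike.
        "nodes_later": nodes_now * growth_factor,
        "triples_later": triples_now * growth_factor,
    }


estimate = estimate_scale(botanists=10_000, publications=37_000,
                          triples_now=500_000, growth_factor=10)
for name, value in estimate.items():
    print(f"{name}: {value:,.1f}")
```

So the question is really about going from roughly 47k nodes and 500k triples today to something on the order of 470k nodes and 5 million triples.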

I hope that when this is done and working, it will be something that exhibits all that Drupal + RDF can be. Although I've had challenges already, I hope that I don't encounter more. :)

Thanks!

--Joel
