How to push a large number of RDF resources into a JFS repository: is there a best practice?

Hao Chen

Jan 16 '13, 10:13 p.m.

I need better performance when importing RDF. My requirement is to import 10,000+ RDF resources into a JFS repository as fast as I can, and I have to query the data immediately after it is imported because I need to run some validation on it. Currently I use the JFS storage RESTful services API to push the data, and I query it with SPARQL.

According to https://jazz.net/wiki/bin/view/Main/DavidJohnson_IndexingAndQuery, the indexing process is asynchronous.

To deal with that, I use the approach described in https://jazz.net/wiki/bin/view/Main/JFSIndexStoreQueryAPI#Using_If_Modified_Since_Last_Mod

But I don't think this is very efficient. That many HTTP requests is too heavy, and indexing lags far behind the inserts. Loading this data always takes hours, which is not acceptable for me. Is there a more efficient way to write RDF data into JFS, such as writing directly to the database? I don't think 10,000 is that large. Thanks a lot.

Accepted answer

Jerome Lanneluc

JAZZ DEVELOPER (210●1●5) Jan 17 '13, 3:53 a.m.

You can save time spent in the HTTP layer by using https://jazz.net/wiki/bin/view/Main/JFSBulkOperations.
If you find the indexing slow, you might want to look at the speed of the disk where the indices are stored (./conf/<your application>/indices)
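
To illustrate why batching cuts down the time spent in the HTTP layer, here is a minimal sketch that groups resources into fixed-size batches before sending them. It does not reproduce the JFS bulk operations request format, which is described on the wiki page linked above. The `send_batch` callable, the `(url, body)` resource shape, and the batch size of 500 are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Assumption: a resource is modeled as (resource_url, rdf_body).
Resource = Tuple[str, str]


def batches(items: Iterable[Resource], size: int) -> Iterator[List[Resource]]:
    """Yield consecutive lists holding at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Resource] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def push_all(
    resources: Iterable[Resource],
    send_batch: Callable[[List[Resource]], None],
    size: int = 500,
) -> int:
    """Send every resource through `send_batch`; return the number of calls.

    `send_batch` is hypothetical: it stands in for whatever code issues
    one bulk request to the server for a list of resources.
    """
    requests_made = 0
    for batch in batches(resources, size):
        send_batch(batch)
        requests_made += 1
    return requests_made
```

With a batch size of 500, pushing 10,000 resources takes 20 calls to `send_batch` instead of 10,000 individual requests.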

Hao Chen selected this answer as the correct answer

One other answer

Philippe Mulet

FORUM MODERATOR / JAZZ DEVELOPER (551●1●2) Jan 17 '13, 3:55 a.m.

There is no way to explicitly control the batching of index activity, or to make it transactional as you suggest. Batching does happen internally when there is a large backlog of resources to index.
We had considered batching JFS changes at commit time, but no current consumer ever requested it.

You should file an enhancement request describing your use case at https://jazz.net/jazz/web/projects/Jazz%20Foundation#action=com.ibm.team.workitem.newWorkItem.

Knowing the time of the last committed resource, you can hold queries until indexing status has reached that point, or use conditional queries (https://jazz.net/wiki/bin/view/Main/JFSIndexStoreQueryAPI#Using_If_Modified_Since_Last_Mod) in the interim.
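
The idea of holding queries until indexing has caught up can be sketched as a simple polling loop. This is only a sketch under stated assumptions: `get_indexed_until` is a hypothetical callable standing in for however you read the index status (the wiki page linked above describes the actual mechanism), and the timeout and polling interval are arbitrary values.

```python
import time
from datetime import datetime
from typing import Callable


def wait_until_indexed(
    last_commit: datetime,
    get_indexed_until: Callable[[], datetime],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the index has reached `last_commit`.

    `get_indexed_until` is hypothetical: it should return the timestamp
    up to which the index is known to be current.
    Returns True once the index has caught up, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_indexed_until() >= last_commit:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Record the commit time of the last resource you pushed, call `wait_until_indexed` with it, and run the SPARQL validation queries only if it returns `True`.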

Question details

extending

× 11,088

jazz-foundation

× 4,375

Question asked: Jan 16 '13, 12:12 p.m.

Question was seen: 6,184 times

Last updated: Jan 17 '13, 3:55 a.m.
