Monday, 26 August 2013

C# - Loading XML file in parts

My task is to load a new set of data (stored in an XML file) and compare
it to the 'old' set (also in XML). All the changes are written to another
file.
My program loads the new and old files into two DataSets, then, row by
row, compares the primary key from the new set with the old one. When I
find the corresponding row, I check all fields, and if any of them differ
from the old row, I write the row to a third set and then write this set
to a file.
Right now I use:
newDS.ReadXml("data.xml");
oldDS.ReadXml("old.xml");
and then I just find the rows with the corresponding primary key and
compare the other fields. It works quite well for small files.
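
For reference, the in-memory comparison described above could look roughly
like the sketch below. It is only an illustration: the names DataSetDiff,
FindChanges and RowsDiffer are hypothetical, and it assumes both tables
already have their PrimaryKey set (DataRowCollection.Find throws if they
do not). Rows that exist only in the new set are skipped, since only
changed rows are collected here.

```csharp
using System;
using System.Data;

public static class DataSetDiff
{
    // True if any column of newRow is missing from oldRow or holds a different value.
    public static bool RowsDiffer(DataRow newRow, DataRow oldRow)
    {
        foreach (DataColumn col in newRow.Table.Columns)
        {
            if (!oldRow.Table.Columns.Contains(col.ColumnName)) return true;
            if (!object.Equals(newRow[col], oldRow[col.ColumnName])) return true;
        }
        return false;
    }

    // Returns a table holding every new row whose primary key exists in
    // oldTable but whose fields differ from the old row.
    // Assumption: PrimaryKey is set on both tables.
    public static DataTable FindChanges(DataTable newTable, DataTable oldTable)
    {
        DataTable changes = newTable.Clone(); // same schema, no rows
        foreach (DataRow newRow in newTable.Rows)
        {
            // Collect the primary key values of the current new row.
            object[] key = Array.ConvertAll(newTable.PrimaryKey, c => newRow[c]);
            DataRow oldRow = oldTable.Rows.Find(key);
            if (oldRow != null && RowsDiffer(newRow, oldRow))
            {
                changes.ImportRow(newRow);
            }
        }
        return changes;
    }
}
```

The resulting table can then be written out with DataTable.WriteXml.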
The problem is that my files may be up to about 4 GB each. If both the new
and old data are that big, loading 8 GB of data into memory is quite
problematic.
I would like to load my data in parts, but to compare I need the whole old
data set (or is there a way to get a specific row with the corresponding
primary key from an XML file?).
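
To make "loading in parts" concrete, here is a minimal sketch of reading
one row element at a time with XmlReader and XNode.ReadFrom, so that only
the current row is held in memory. The class and method names
(XmlRowStreamer, StreamRows) are hypothetical, and it assumes the name of
the repeating row element is known at run time (for example, supplied by
the user together with the file structure).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class XmlRowStreamer
{
    // Lazily yields each element named rowName, one at a time.
    // Assumption: rowName is the user-defined name of the repeating row element.
    public static IEnumerable<XElement> StreamRows(TextReader input, string rowName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == rowName)
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() is needed.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

It could be used as
foreach (XElement row in XmlRowStreamer.StreamRows(new StreamReader("data.xml"), "Row")),
where "Row" is a placeholder element name. This only addresses reading in
parts; looking up the matching old row by primary key still needs some
kind of index or a common sort order on both files.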
Another problem is that I don't know the structure of the XML file. It is
defined by the user.
What is the best way to work with such big files? I thought about using
LINQ to XML, but I don't know whether it has features that can help with
my problem. Maybe it would be better to drop XML and use something
different?