Hello everyone,
I would like your advice on how to design the data structures for the following scenario as efficiently as possible.
One thread creates data (each item is composed of an ID and content) and outputs it to a queue or some other structure (the data structure can be chosen to make the scenario more efficient). This thread produces data very frequently. Another thread is responsible for aggregating the data (for the same ID, it aggregates the content and outputs the result to a file). The aggregation thread works much less frequently: it sleeps for 10 minutes, aggregates, and then sleeps again.
I am looking for a solution that balances two goals:
1. As little performance impact as possible on the data creation thread;
2. The data aggregation thread should work as efficiently as possible and consume as little memory as possible.
Any advice on how to design the data structures?
Currently,
- I am stupidly using a List, which the data creation thread appends to. I think appending to the List has less performance impact on the data creation thread than inserting into a Dictionary. Am I correct?
- The aggregation thread reads the List from beginning to end, using the ID as the key into a Dictionary. Since there may be duplicate IDs, before inserting into the Dictionary I check whether it already contains the key: if yes, I update the existing data; otherwise I insert a new entry;
- I use a lock on the whole List to make it thread-safe. Is locking the whole List too heavy?
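To make the current approach concrete, here is a minimal sketch of it. It is written in Java purely for illustration, with ArrayList and HashMap standing in for the List and Dictionary. All names (AggregationSketch, Item, add, aggregate) are made up, and several details are assumptions: the ID is an int, the content is a String, "aggregate" means concatenation, the lock is held for the whole pass, and the List is cleared after each pass.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the List + Dictionary + single lock approach.
class AggregationSketch {
    // One produced item: an ID plus its content (types are assumptions).
    static final class Item {
        final int id;
        final String content;

        Item(int id, String content) {
            this.id = id;
            this.content = content;
        }
    }

    private final Object lock = new Object();
    private final List<Item> pending = new ArrayList<>();

    // Called by the data creation thread: append under the lock.
    void add(int id, String content) {
        synchronized (lock) {
            pending.add(new Item(id, content));
        }
    }

    // Called by the aggregation thread after each sleep.
    // Holds the lock for the whole pass, so the creation thread
    // is blocked in add() until the pass finishes.
    Map<Integer, String> aggregate() {
        Map<Integer, String> byId = new HashMap<>();
        synchronized (lock) {
            for (Item item : pending) {
                if (byId.containsKey(item.id)) {
                    // Duplicate ID: update the existing entry
                    // (concatenation is only an assumed aggregation).
                    byId.put(item.id, byId.get(item.id) + item.content);
                } else {
                    byId.put(item.id, item.content);
                }
            }
            // Assumption: processed items are dropped after each pass.
            pending.clear();
        }
        return byId;
    }

    public static void main(String[] args) {
        AggregationSketch sketch = new AggregationSketch();
        sketch.add(1, "a");
        sketch.add(2, "c");
        sketch.add(1, "b");
        System.out.println(sketch.aggregate());
    }
}
```

In this sketch the append itself is cheap, but the creation thread has to wait whenever the aggregation pass holds the lock, which is the part I am worried about.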
Any smarter ways? :-)
Thanks in advance,
George