Data Generator
Note: The default Data Generator requires that the org.tolven.ccr plug-in be included in the tolven-config/plugins.xml file. Without it, the generator will run, but no data will be created in Tolven.
Tolven contains a demo data generator and some extra application features that support demo data creation. This feature has many purposes:
- Test data for performance benchmarks
- Generate demo data for a specific demo
- Populate data to understand the application without first having to generate or backload a lot of data
- Generate reasonable data to test cross-patient queries
- During development, avoid, or at least delay, the need to acquire real patient data
The generator is based on a more-or-less average, real-world population, although in general the "real world" is limited to Western cultures, and the most obvious attributes, such as name, address, and birth and death rates, are based on the US. There is no attempt to match any specific population, although every attempt is made to provide what would be considered reasonable data. For example, only males get prostate cancer, and most, but not all, breast cancer diagnoses go to females.
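A minimal sketch of how such sex-dependent constraints could be expressed when drawing diagnoses. The rates, the table name, and the function name are illustrative assumptions; they are not taken from Tolven or from any real statistics.

```python
import random

# Illustrative numbers only: NOT taken from Tolven or from real statistics.
# Each entry gives the probability that a person of that sex is assigned
# the diagnosis.
HYPOTHETICAL_RATES = {
    "prostate cancer": {"M": 0.10, "F": 0.0},    # never assigned to females
    "breast cancer": {"M": 0.001, "F": 0.12},    # rare, but possible, in males
}

def assign_diagnoses(sex, rng):
    """Return the list of diagnoses drawn for one generated person."""
    return [name for name, rates in HYPOTHETICAL_RATES.items()
            if rng.random() < rates[sex]]

# Generate a small population with a fixed seed so runs are repeatable.
rng = random.Random(42)
population = [(sex, assign_diagnoses(sex, rng))
              for sex in rng.choices("MF", k=10_000)]
```

A rate of 0.0 makes a diagnosis impossible for that sex, while a small non-zero rate keeps it rare but possible, which is the distinction the paragraph above draws.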
Actual vs. Reported Data
Working with generated data can be a little disorienting at first. Here's an example: Helen Smith is generated. At the moment of generation, her date of birth, date of death, and cause of death are all known to the generator. However, that information may or may not be known to the health record, and future events are certainly never known (for sure). Therefore, you may see a family with unborn children whose dates of birth are nonetheless known. All of that future and "accurate" information is kept separate from the "real" data, which is very often much less accurate and complete. A very good example of the separation between actual and reported data is immunizations: an adult may remember whether they had a certain immunization but probably wouldn't remember the date. Yet, statistically, we know about how many people did have the immunization.
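One way to picture the separation is as two distinct structures: the generator's ground truth and the record derived from it. The sketch below is an assumption for illustration only; the class names, fields, and the dates used for Helen Smith are hypothetical and do not reflect Tolven's actual data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class ActualLife:
    """Ground truth known only to the generator (hypothetical structure)."""
    date_of_birth: date
    date_of_death: Optional[date]
    cause_of_death: Optional[str]
    immunizations: dict = field(default_factory=dict)  # vaccine -> date given

@dataclass
class ReportedRecord:
    """What the health record holds: possibly incomplete or imprecise."""
    date_of_birth: Optional[date] = None
    immunizations: dict = field(default_factory=dict)  # vaccine -> date or None

def report_as_of(actual, today, remembers_dates=False):
    """Derive a reported record that never reveals future events."""
    reported = ReportedRecord()
    if actual.date_of_birth <= today:
        reported.date_of_birth = actual.date_of_birth
    for vaccine, given in actual.immunizations.items():
        if given <= today:
            # The patient recalls having the immunization, but usually not when.
            reported.immunizations[vaccine] = given if remembers_dates else None
    return reported

# Made-up example values for the "Helen Smith" case described above.
helen = ActualLife(date(1950, 3, 1), date(2031, 6, 9), "stroke",
                   {"tetanus": date(1962, 5, 4)})
record = report_as_of(helen, today=date(2010, 1, 1))
```

The reported record deliberately has no death fields at all, so future events cannot leak into it.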
Tolven is all about integrating patient and clinical data, and this makes for some interesting discrepancies. Ask a person if they are allergic to penicillin and they probably know whether they are or not. However, healthcare workers may or may not accurately record that information. The healthcare worker may have no objective data, other than to simply transcribe what the patient reports.
The point is that there is a skew in the data because this is Tolven. In general, the data generator assumes that patients know their own demographics and vital statistics, and that, unless the provider physically delivered the patient at birth, the provider's data is at best hearsay from the patient. Patient data might vary from clinical data in cases where there is no Tolven-based electronic connection between the patient and his or her providers (and often between providers).
This is quite apparent in demographic data generation. Typical EMPIs (enterprise master patient indexes) are a compromise in data accuracy. Most patients visit providers irregularly, so they may move or get married between visits. Providers are often distributed and, even within a single enterprise (let alone a community), have little choice but to guess whether two records belong to one person or to two separate persons. Patients, on the other hand, don't have that problem. For example, if a patient visits two providers and uses Tolven to communicate with both, there is never a need to "guess" the patient's identity. So, while two providers on their own may not know whether two medical records are for the same person, Tolven, assuming the patient is in the picture, has no need to emulate the "inaccurate" demographic data that a typical provider-only system would need.
Generating Demo Data
When creating a Tolven Account (think of an account as either a family or a provider), a demo option should be available. For a clinical account, demo data can also be added at any time after initial account creation.
On the demo data generation page, you can select the number of patients to create. Additional options are planned.
When you click Create Patients, the data generator creates the number of people that you specify. It also tries to group them into families. Each family gets an address (in the U.S.), so its members are assumed to live together. There are usually enough people left over to represent single people.
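The grouping described above can be sketched roughly as follows. This is not the generator's actual algorithm: the function name, the maximum family size, and the use of a numeric `address_id` in place of a generated U.S. address are all assumptions made for illustration.

```python
import random

def group_into_households(people, rng, max_family_size=5):
    """Split people into households; a household of one is a single person."""
    remaining = list(people)
    households = []
    while remaining:
        # Pick a household size, never larger than the people still unassigned.
        size = min(rng.randint(1, max_family_size), len(remaining))
        members, remaining = remaining[:size], remaining[size:]
        # address_id is a hypothetical stand-in for a generated U.S. address.
        households.append({"address_id": len(households) + 1,
                           "members": members})
    return households

rng = random.Random(7)
people = [f"person-{i}" for i in range(20)]
households = group_into_households(people, rng)
```

Every person ends up in exactly one household, and everyone in a household shares one address.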
The people created will be assigned to a newly created account, not the account the user is currently logged into. Therefore, don't look for the data just yet. After the data is created, you can log out. When you log back in, you should be given a choice to select the new account (or any other account you might be a member of). If you don't see the choice, it is probably because you selected the "automatically select this account next time" option. At the bottom of the screen, there is a Preferences option which allows you to either change that setting or turn it off.
Overall Flow
1. A request to generate a specific number of patients is placed on the generator queue, so control returns to the user/caller immediately.
2. The generator starts by choosing a family name which it displays in the log. The maximum allowed patients to generate is controlled by the tolven.gen.patient.max server property.
3. Each generated patient is then queued to the simulator, which eventually generates CCR messages and queues the generated CCR messages to the "Rule queue" just as if those messages had come from a real source system. Therefore, at this point, the generator is no longer in the picture. The CCR message will contain a full record of the activity for the generated patient.
4. The "Evaluator" empties the Rule Queue, processing each inbound message in turn. For each message, the Evaluator asks each of the plugged-in message processors (trim, ccr, cda, and some others) whether it can process the message. If none can, you should see an error message "No message processor found for...". That would suggest that the org.tolven.ccr plug-in is missing from plugins.xml.
5. If the message is handled by the CCR processor, it will insert the resulting "placeholders" extracted from the CCR message into the working memory of the rules engine, along with other important "facts" and then the rules engine goes to work.
6. If there are no rules in working memory, nothing further happens. However, if the example rules are loaded (they are specific neither to the generator nor to the CCR format), trace messages indicate that certain placeholders are being referenced, which is the most frequent consequence of rules firing. These messages are prefixed by [AppEvalAdapter].
7. The consequence of these actions is that appropriate database tables are updated.
8. Periodically, the browser polls the server to find out if there are updates that it should be aware of. If so, then the appropriate pane in the browser is updated. If the account has disabled auto-update (rare), then the user must manually refresh the page to see updates.
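The queue-and-dispatch part of this flow (steps 1 through 5) can be sketched as below. It is a simplified model, not Tolven code: the class and function names, the message shape, and the choice to cap an oversized request at the limit are assumptions. This document does not say how the server reacts when tolven.gen.patient.max is exceeded, nor what follows "for" in the real error message.

```python
from collections import deque

MAX_PATIENTS = 100  # stand-in for the tolven.gen.patient.max server property

class CcrProcessor:
    """Hypothetical processor that accepts only CCR messages."""
    def can_process(self, message):
        return message["format"] == "ccr"

    def process(self, message, working_memory):
        # Placeholders extracted from the message become facts for the rules.
        working_memory.extend(message["placeholders"])

def generate(requested, rule_queue):
    """Queue one simulated CCR message per patient, capped at MAX_PATIENTS."""
    for n in range(min(requested, MAX_PATIENTS)):
        rule_queue.append({"format": "ccr", "placeholders": [f"patient-{n}"]})

def evaluate(rule_queue, processors, log):
    """Drain the rule queue, offering each message to every processor in turn."""
    working_memory = []
    while rule_queue:
        message = rule_queue.popleft()
        for processor in processors:
            if processor.can_process(message):
                processor.process(message, working_memory)
                break
        else:
            # No processor accepted the message; the suffix after "for" is a
            # placeholder, since the real message ending is not shown here.
            log.append(f"No message processor found for {message['format']}")
    return working_memory

queue, log = deque(), []
generate(3, queue)
facts = evaluate(queue, [CcrProcessor()], log)   # CCR plug-in present
generate(1, queue)
missing = evaluate(queue, [], log)               # CCR plug-in absent
```

With no processor registered, the message is consumed but produces no facts, which mirrors the note at the top of this page: the generator runs, but no data is created.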

