@carlhiggs 7.5 hours is indeed too much. What were the computing specifications? Specifically, the number of cores and RAM?

A likely contributor to the slower run times was the implementation of a coefficient lookup instead of hard-coded coefficients for the active mode weights. This probably explains why the health indicator analysis was particularly slow for the walk mode. Using the new test region I was able to prototype and test the back-tracking (#168; needed because the recently started exposure analysis was proceeding like the one a month ago, with 1 of 7 days not processed for health indicators after 4-5 days; too slow). I've restarted the main analysis using the updated code following local tests, and I'll continue to use the Brunswick test region for the cycling scenario.
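If the lookup is indeed the culprit, the obvious mitigation is to resolve the coefficients once at start-up and cache them keyed by mode, rather than consulting the lookup inside the per-person loop. A minimal sketch of that idea (the Mode enum, ActiveModeWeights class and rawLookup map below are illustrative placeholders, not the actual JIBE classes):

```java
import java.util.EnumMap;
import java.util.Map;

public class ActiveModeWeights {

    // Hypothetical mode enum, for illustration only.
    public enum Mode { WALK, BICYCLE }

    // Coefficients resolved once at start-up instead of per person/trip.
    private final Map<Mode, Double> coefficients = new EnumMap<>(Mode.class);

    public ActiveModeWeights(Map<String, Double> rawLookup) {
        // rawLookup stands in for whatever CSV/properties-backed table the
        // coefficient lookup reads; the keys and defaults here are assumptions.
        coefficients.put(Mode.WALK, rawLookup.getOrDefault("walk", 0.0));
        coefficients.put(Mode.BICYCLE, rawLookup.getOrDefault("bicycle", 0.0));
    }

    // Constant-time retrieval inside the hot loop of the exposure analysis.
    public double coefficient(Mode mode) {
        return coefficients.get(mode);
    }
}
```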

The MITO and SILO workflows are computationally intensive, at least when applied at city scale: for Manchester (2,827,276 persons and 8,966 output area zones), and perhaps more so for Melbourne (4,174,056 persons and 10,289 SA1 zones).
To speed up the process of development, I previously created sample datasets of 100 randomly selected households (for Melbourne, that came to 231 persons). While that had some advantages for speeding things up, running some processes, in particular the current accident and exposure model, still proved incredibly time intensive even for the small test population: about 7.5 hours.
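For reference, the household sub-sampling itself is conceptually just a seeded shuffle and truncation; a minimal sketch (H is a stand-in for whatever household record the synthetic population uses, not the actual SILO class):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class HouseholdSampler {

    // Draw a fixed-size random sample of households (e.g. 100) from the full
    // synthetic population; the fixed seed keeps the sample reproducible.
    public static <H> List<H> sample(List<H> households, int sampleSize, long seed) {
        List<H> shuffled = new ArrayList<>(households);
        Collections.shuffle(shuffled, new Random(seed));
        return shuffled.subList(0, Math.min(sampleSize, shuffled.size()));
    }
}
```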
I tested a different approach: instead of taking a small population sample for the full study region, take the full synthetic population for a smaller sub-region, for example the suburb of Brunswick in Melbourne. This corresponded to 23,469 persons (much larger than my previous toy populations for test purposes) and 56 zones. I found that processing the same RunHealthExposureOffline code took 5 minutes (SiloMEL took 11 minutes). That was using a 10% MATSim sample, but it suggests a 100% MATSim population would also be feasible. This will greatly speed up prototyping and debugging.

To do this, I derived Brunswick-specific excerpts of all the Melbourne input and microdata. For the synthetic population, I re-allocated persons with jobs and schools outside Brunswick to randomly selected jobs and schools within Brunswick. All amenities, buildings and omx files were restricted to Brunswick, and I re-constructed the network for a 1 km buffer around Brunswick. The network reconstruction was a bit complicated; I created a Java class in the Melbourne SILO use case utils module to handle this. In short, the purpose of this test isn't to represent the 'real' Brunswick, it's to create a micro dataset that is otherwise 'realistic' but much more performant. The overall data size is also much smaller: for Melbourne, ~300 MB compared to ~30 GB.
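To make the re-allocation step concrete, here is a rough sketch of the idea for jobs (schools are handled analogously). Person, Job and the zone accessors are placeholder interfaces for illustration only, not the actual SILO API, and the real code differs in detail:

```java
import java.util.List;
import java.util.Random;
import java.util.Set;

public class SubRegionReallocator {

    // Placeholder types so the sketch stands alone; not the actual SILO classes.
    interface Job { int getZoneId(); }
    interface Person { Job getJob(); void setJob(Job job); }

    private final Random random = new Random(20241001L); // fixed seed keeps the derived test data reproducible

    // Re-assign any person whose job lies outside the sub-region to a randomly
    // selected job inside it. Realism of the match is deliberately sacrificed
    // for a small, fast-running test dataset.
    public void reallocateJobs(List<Person> persons, List<Job> jobsInRegion, Set<Integer> regionZoneIds) {
        for (Person person : persons) {
            Job currentJob = person.getJob();
            if (currentJob != null && !regionZoneIds.contains(currentJob.getZoneId())) {
                Job replacement = jobsInRegion.get(random.nextInt(jobsInRegion.size()));
                person.setJob(replacement);
            }
        }
    }
}
```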
Not sure if this kind of example region would also be useful for your Manchester test purposes @usr110 @ismailsaadi @TabeaSonnenschein @berdikhanova @BelenZapata85 @JDWoodcock.
The approach was somewhat ad hoc (some modifications were manual, others coded), as it was only for test purposes, and I wasn't sure whether the time taken (some hours) would pay off, but I believe it did.
I've copied the current inputs and outputs (from running the health exposure offline runner after yesterday's air pollution update) to a "melbourne - brunswick test area" subfolder in the JIBE working group drive.
Here's an example summary of exposures for Brunswick. You can see we haven't modelled income for the synthetic population (that's why it's constant; it's something for future work). I noticed pm25 and no2 exposures reduced a little after the most recent formula update, which makes sense. I also noticed the upper end of the noise exposure estimates is very high, and these have been biased upwards with much larger values since changes earlier in October. I'm not sure if that is from changes related to the accident model branch that I've incorporated, or local to our Melbourne use case. This is something I can now more easily explore with the faster-running test region!
I'd love to know the kinds of values you are getting on your side for the noise exposures, and exposures in general.