core: fetch / cache timetable data on core side (like how we handle infra loading) #11019

Open
eckter opened this issue Mar 4, 2025 · 2 comments
Labels
area:core Work on Core Service area:editoast Work on Editoast Service kind:enhancement Improvement of existing features kind:performance Reduction of computing time or memory use kind:refacto-task Task related to Refactorization Epic module:stdcm Short-Term DCM

Comments

@eckter
Contributor

eckter commented Mar 4, 2025

Who would benefit from this feature?

Both

What is this feature about?

It's a possible refactoring I have mentioned several times IRL, but I hadn't created an issue yet (or I lost track of it). This is a suggestion to discuss, not something we know we want to do.

For an STDCM request on a real-life timetable, we spend a lot of time building, sending, and parsing the request (tens of seconds). The reason for that is the "train requirements", a massive list of all the resources used by the trains in the timetable. In this trace, we spent 52s (!) just parsing the request.

We could handle it like we handle infras: core would only load the timetable when required, then cache the values. The main STDCM request would only contain the actual inputs (origin/destination and times).

There would be some refactoring to do, like how we keep track of timetable changes, or how we may only load a subset of the timetable.
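To make the idea concrete, here is a minimal sketch of such a cache. Everything in it is an assumption for illustration: `TimetableCache`, `TrainRequirements`, and the `version` counter used to detect timetable changes are hypothetical names, not existing core or editoast types. The `load` closure stands in for whatever call would fetch the requirements.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the per-train resource list ("train requirements").
#[derive(Clone, Debug, PartialEq)]
struct TrainRequirements {
    train_id: u64,
    resources: Vec<String>,
}

/// Hypothetical cache entry: the requirements plus the timetable version
/// they were loaded at (assumed to be bumped on every timetable change).
struct CachedTimetable {
    version: u64,
    requirements: Vec<TrainRequirements>,
}

#[derive(Default)]
struct TimetableCache {
    entries: HashMap<u64, CachedTimetable>,
    /// Number of times `load` was actually called, for illustration.
    loads: usize,
}

impl TimetableCache {
    /// Returns the cached requirements, calling `load` only when the
    /// timetable is missing from the cache or cached at another version.
    fn get_or_load<F>(&mut self, timetable_id: u64, version: u64, load: F) -> &[TrainRequirements]
    where
        F: FnOnce() -> Vec<TrainRequirements>,
    {
        let stale = self
            .entries
            .get(&timetable_id)
            .map_or(true, |entry| entry.version != version);
        if stale {
            self.loads += 1;
            let requirements = load();
            self.entries
                .insert(timetable_id, CachedTimetable { version, requirements });
        }
        &self.entries[&timetable_id].requirements
    }
}
```

With something like this, the STDCM request itself would only need to carry the timetable ID (and version), and repeated requests on an unchanged timetable would skip the expensive load.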

Why is this feature valuable?

  1. It would reduce the massive overhead of building the request, saving a lot of time for each call
  2. It would make it easier to save and reproduce past requests (if and only if we save timetable contents)
  3. It would help with memory use when running parallel STDCM requests on a single worker (which currently causes OOM errors, but we may block that case entirely)
@Khoyo
Contributor

Khoyo commented Mar 4, 2025

A potential problem I see with this is that editoast currently filters which train requirements it sends to stdcm (based on earliest departure/latest arrival), see:

    let earliest_departure_time = stdcm_request.get_earliest_departure_time(simulation_run_time);
    let latest_simulation_end = stdcm_request.get_latest_simulation_end(simulation_run_time);
    let timetable = Timetable::retrieve_or_fail(&mut conn, timetable_id, || {
        StdcmError::TimetableNotFound { timetable_id }
    })
    .await?;
    // Only schedules overlapping the request's time window are kept
    let train_schedules = timetable
        .schedules_in_time_window(&mut conn, earliest_departure_time, latest_simulation_end)
        .await?;

Would taking the whole timetable impact stdcm performance? I'm not that worried about timetable updates, they are expected to be infrequent in real life usage.

(I'm also guessing that the SQL query might take a long time since we have no index on departure and arrival time, but I haven't measured anything yet)

@eckter
Contributor Author

eckter commented Mar 4, 2025

A potential problem I see with this is that currently editoast has a filter on which trains req it sends to stdcm (based on earliest departure/latest arrival)

What I had in mind would be to keep track of which time ranges are "covered" by the cached elements.

For example, if we send a request that needs the timetable from 9:00 to 15:00, we'd fetch those trains and cache them. Then if we send the same request shifted an hour later (10:00 to 16:00), we'd only request trains between 15:00 and 16:00. Some trains would appear twice (any train that is running at 15:00), but we can probably deduplicate them without too many issues.
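A rough sketch of that bookkeeping, under stated assumptions: `WindowedCache`, `Train`, and the minutes-since-midnight time unit are hypothetical and only there to illustrate the idea. The cache keeps a sorted list of covered ranges, computes which parts of a requested window are still missing, and stores trains by ID so a train fetched twice is only kept once.

```rust
use std::collections::HashMap;

/// Half-open time range [start, end), in minutes since midnight (illustrative unit).
type Range = (u32, u32);

/// Hypothetical cached train: an ID and the time span during which it runs.
#[derive(Clone, Debug, PartialEq)]
struct Train {
    id: u64,
    span: Range,
}

#[derive(Default)]
struct WindowedCache {
    /// Sorted, non-overlapping ranges that have already been fetched.
    covered: Vec<Range>,
    /// Trains keyed by ID, so a train fetched twice is stored once.
    trains: HashMap<u64, Train>,
}

impl WindowedCache {
    /// Returns the parts of `wanted` that are not covered yet.
    fn missing(&self, wanted: Range) -> Vec<Range> {
        let mut gaps = Vec::new();
        let mut cursor = wanted.0;
        for &(start, end) in &self.covered {
            if end <= cursor {
                continue; // entirely before the part we still need
            }
            if start >= wanted.1 {
                break; // entirely after the requested window
            }
            if start > cursor {
                gaps.push((cursor, start));
            }
            cursor = cursor.max(end);
        }
        if cursor < wanted.1 {
            gaps.push((cursor, wanted.1));
        }
        gaps
    }

    /// Records a fetched range and its trains, merging touching ranges.
    fn insert(&mut self, fetched: Range, trains: Vec<Train>) {
        for train in trains {
            self.trains.insert(train.id, train); // deduplicates by ID
        }
        self.covered.push(fetched);
        self.covered.sort();
        let mut merged: Vec<Range> = Vec::new();
        for &(start, end) in &self.covered {
            if let Some(last) = merged.last_mut() {
                if start <= last.1 {
                    last.1 = last.1.max(end);
                    continue;
                }
            }
            merged.push((start, end));
        }
        self.covered = merged;
    }
}
```

In the 9:00 to 15:00 example, a follow-up request for 10:00 to 16:00 would yield a single missing range, 15:00 to 16:00, and a train running across 15:00 would be returned by both fetches but stored once. This sketch does not handle timetable changes, which would need some invalidation on top.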
