na_kalman is slow for long time series #42
Comments
Hey @jonekeat, sorry for the late answer, I have been quite busy. I definitely know about these performance issues of na.kalman and I am not very satisfied with them either. Compared to the other methods, na.kalman is of course more complex and will probably always be much slower than e.g. na.interpolation. What I would have to do is switch to another implementation or library for the Kalman filter related parts (this is the current bottleneck). A nice side benefit: this would also solve another issue, where a certain error occurs for an edge case. I have this in mind and I think an improvement will come, but don't expect it in the short term.
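The speed gap between the two methods mentioned above can be checked with a rough timing sketch. It assumes imputeTS 3.x, where the functions are named `na_kalman` and `na_interpolation`, and it uses the package's bundled `tsAirgap` series purely as a small stand-in. Absolute timings depend on the machine and grow with series length and frequency.

```r
library(imputeTS)

# tsAirgap ships with imputeTS: a monthly series containing missing values.
x <- tsAirgap

# Time both imputations; elapsed seconds are machine-dependent.
t_interp <- system.time(imp_interp <- na_interpolation(x))[["elapsed"]]
t_kalman <- system.time(imp_kalman <- na_kalman(x, model = "StructTS"))[["elapsed"]]

cat("na_interpolation:", t_interp, "s\n")
cat("na_kalman:       ", t_kalman, "s\n")
```

On a series this short both calls finish quickly; the difference becomes more visible on longer, high-frequency series like the one in the original report.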
Maybe this package could help with performance?
This looks interesting @trafficonese, it could solve our problems. Any idea why it isn't on CRAN? I'd say that, since rcppkalman is not on CRAN, the easiest way to integrate these functions would be to copy the needed source files (only these) into the imputeTS package. On the rcppkalman GitHub it says: I'd guess that if we mark the origin of the copied files in the source code this should be alright, since imputeTS is also GNU GPL 3. What do you think about this, @trafficonese?
It might be worth a try to test the functions and benchmark them with the current solution. For now, I was testing the given example and found that it's actually the function When you do the original example with
rcppkalman still isn't on CRAN, but it seems they are actively developing it. I also found Package ‘FKF’ Fast Kalman Filter. I'll have a look at whether this can be used.
Line 200 of file na_kalman.R:
Ah, nice find. I just realized you first wrote here and then in a new thread. I see your point: when looking at the StructTS source code, it seems there is indeed one unnecessary call to I think I could edit the source code of StructTS, remove the unnecessary parts and add it to the imputeTS package, but that would mean I have to maintain it on my own then. It also seems to me that this part does not really contribute that much to computing time.
part in StructTS consumes nearly ALL of the computing time, with everything else being basically irrelevant. But I have to check this again; the profiling was only a quick try.
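A quick profiling run of the kind mentioned above could look like the following sketch. It uses only base R (`Rprof` and `summaryRprof` from utils, `StructTS` from stats) and profiles `StructTS` directly on the built-in `UKgas` series, which is an assumption made to keep the example small; it is not the setup used in this thread, and the loop count is arbitrary.

```r
# Profile repeated StructTS fits using only base R (stats + utils).
# UKgas is a small built-in quarterly series, used here as a stand-in.
prof_file <- tempfile(fileext = ".out")

Rprof(prof_file, interval = 0.01)
for (i in 1:30) {
  # Repeat the fit so the sampling profiler collects enough samples.
  fit <- StructTS(log10(UKgas), type = "BSM")
}
Rprof(NULL)

prof <- summaryRprof(prof_file)
# Functions ranked by total time, including time spent in their callees.
print(head(prof$by.total, 10))
```

Reading `prof$by.total` from the top shows which calls inside `StructTS` account for most of the sampled time on a given machine.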
I hope this message finds you well. I am trying to impute missing values using na_kalman(forex_ts, model = "auto.arima", smooth = TRUE) and forex_kalman1 <- na_kalman(forex_ts, model = "StructTS", smooth = TRUE), but the run takes a long time (more than 3 hours), even though my laptop is new (MacBook Pro 2020). Are there any suggestions for speeding this up (e.g. reducing the number of iterations or something like that)? Regards, Ahmad
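One knob that might be worth trying, although nobody in this thread has confirmed it for this case: `na_kalman` forwards extra arguments through `...` to `stats::StructTS`, and `StructTS` defaults to a basic structural model (BSM, with a seasonal component) whenever `frequency(x) > 1`. Passing `type = "level"` or `type = "trend"` requests a smaller model. The sketch below assumes imputeTS 3.x and uses the bundled `tsAirgap` series as a stand-in; whether the simpler model gives acceptable imputations depends on the data.

```r
library(imputeTS)

x <- tsAirgap  # monthly example series bundled with imputeTS

# Default: frequency(x) > 1, so StructTS fits a BSM with a seasonal component.
t_bsm <- system.time(imp_bsm <- na_kalman(x, model = "StructTS"))[["elapsed"]]

# `type` is passed through `...` to stats::StructTS;
# "level" fits a local level model without trend or seasonality.
t_level <- system.time(
  imp_level <- na_kalman(x, model = "StructTS", type = "level")
)[["elapsed"]]

cat("BSM:  ", t_bsm, "s\n")
cat("level:", t_level, "s\n")
```

The two results will generally differ, so it is worth comparing the imputed values (for example with `ggplot_na_imputations`) before adopting the faster setting.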
Hi @SteffenMoritz
Thanks for the amazing package imputeTS. However, I found it to be slow when imputing long time series (~3000 daily observations) with
na_kalman(x, model = "StructTS")
Below is the reproducible example:
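library(imputeTS)
# Simulated daily series: 3000 observations, 900 of them set to NA
series <- ts(rnorm(3000), start = c(2000, 1), frequency = 365.25)
na_idx <- sample(1:3000, 900)  # avoid shadowing base::sample
series[na_idx] <- NA
na_kalman(series)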
Related Stack Overflow question:
https://stackoverflow.com/questions/52841828/why-is-imputets-hanging-taking-so-long-to-na-kalman-this-data-set
Is there any planned development to solve these performance issues when encountering long time series?