Make the API consistent with Statistics.jl/StatsBase.jl #9
@huangziwei this is marked as a draft because it is not yet complete, but before I continue, it would be helpful to have a chat (synchronous or asynchronous on here) about the discussion points highlighted above.
@sethaxen cool! I will have a look later and get back to you here soon!
@sethaxen This PR is a good start for polishing this package and pushing it further; however, it wouldn't necessarily be a breaking version. Supporting The other thing I had planned is to improve the documentation and CI testing. Right now, the docs only explain the function signatures, without examples or plots, and the formulas used are mentioned only by reference in comments. It would be good to add the precise math in LaTeX to the docs, along with full references. This could promote the package among several similar ones and provide learning material for users too. It should also be straightforward to add GitHub workflows with code coverage.
I agree that methods taking
Yes, this would be nice. The implementations themselves also need some updates to avoid unnecessary allocations, type promotions, and type instability, but these updates are orthogonal to this PR, which strictly focuses on API updates.
gosh, time flies so fast and it's been a year, but I still haven't had the time to go through this (and I still can't properly review it as I still don't do Julia at all...). I'd suggest, @sethaxen, that you take over the Julia version (no need to involve me, for now) and that we only communicate at the API design level to keep things consistent, if necessary?
@huangziwei, sorry for the delay in replying. I opened this PR as part of my work at the @mlcolab and intended it as part of a larger effort we had discussed last year to improve consistency with both the Julia stats ecosystem and with other circstats packages (where applicable). However, priorities have since changed. I will soon be moving on from my position at the colab and no longer plan to continue the effort; thus it would not be ideal for me to make changes to the package if no one else is able to maintain those changes. If there's future interest in your group (or from someone else) in refreshing this package, then I think this PR is still a good starting point for that.
This PR makes a number of breaking changes to make the API of `circ_foo` more consistent with its corresponding function `foo` (if defined) in Statistics/StatsBase.
Main changes
Supporting `StatsBase.AbstractWeights`
Weighted scalar statistics are handled in StatsBase using `StatsBase.AbstractWeights` types, which can represent not only frequency weights but also other kinds of weights. Unlike the current API, weights in StatsBase are always vectors, so when they are used, the `dims` keyword is not supported; instead, an optional positional argument `dim::Int` is provided, specifying the single dimension that will be reduced to a singleton, i.e., the signatures are
This PR adopts the same signatures for `circ_foo`, which unfortunately does require some code duplication. Future refactors could reduce this duplication.
Avoiding recomputing `circ_mean` and `circ_r`
When a function requires `r` or `mean`, it now accepts one or both of these as a keyword argument, allowing for some speed-ups. See how `circ_stats` does this, for example. The PR also adds `circ_mean_and_r`, analogous to `mean_and_var` and `mean_and_std` in StatsBase.
Return a single statistic
Now `circ_mean` returns just the mean, while `circ_std` and `circ_var` accept a `kind` keyword to specify the kind of statistic to return. It would be nice if we could do something similar for `circ_skewness` or `circ_kurtosis` as well. `circ_moment` could be similarly simplified to return the complex moment, but as this isn't consistent with the other moment functions, this doesn't seem ideal.
What's left
`circ_skewness`, `circ_kurtosis`, and `circ_moment`
`AbstractWeights` weight types.
Discussion
Before this PR, this package supported multidimensional arrays of weights. Is there a clear use case for this? JuliaStats/StatsBase.jl#776 discusses adding something similar to StatsBase, but if this happens, it wouldn't happen for some time.
cc @huangziwei, @Meteore
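For readers skimming the thread, the "avoid recomputing `circ_mean` and `circ_r`" idea described earlier can be sketched roughly as follows. This is a minimal illustration using only Base Julia, not the code in this PR: the function bodies, the `r` keyword default of `nothing`, and the specific definitions of circular variance (`1 - r`) and circular standard deviation (`sqrt(-2 log r)`) are assumptions chosen for the example, and the `kind` keyword is deliberately left out.

```julia
# Hypothetical sketch only: names mirror the PR description, bodies are assumed.

# Mean direction and mean resultant length from one pass over the angles (radians).
function circ_mean_and_r(x)
    z = sum(cis, x) / length(x)   # mean resultant vector as a complex number
    return angle(z), abs(z)       # (mean direction, resultant length r)
end

circ_r(x) = last(circ_mean_and_r(x))

# Circular variance, assumed here to be 1 - r.
# A precomputed `r` can be passed to skip the pass over `x`.
function circ_var(x; r=nothing)
    rr = r === nothing ? circ_r(x) : r
    return 1 - rr
end

# Circular standard deviation, assumed here to be sqrt(-2 * log(r)).
function circ_std(x; r=nothing)
    rr = r === nothing ? circ_r(x) : r
    return sqrt(-2 * log(rr))
end

# Usage: compute the mean and r once, then reuse r for the other statistics.
x = [0.1, 0.2, 6.2, 0.05]
μ, r = circ_mean_and_r(x)
v = circ_var(x; r=r)
s = circ_std(x; r=r)
```

The design point is simply that callers who already hold `r` (or the mean) pay for the reduction over `x` once, in the same spirit as `mean_and_var` in StatsBase.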