17 results tagged statistics

"Local git statistics including GitHub-like contributions calendars."

When you first start reading about Brave, you learn that it is a new reward system for publishers and a new advertising model.

You may have wondered how many publishers there are, and who they are.

batgrowth.com scrapes the web to list websites that are BAT publishers.

You will learn in this post how to:

- decompose double-seasonal time series
- detrend time series
- model and forecast double-seasonal time series with trend
- use two types of simple regression trees
- set important hyperparameters related to regression trees

This web site contains notes and materials for an advanced elective course on statistical forecasting that is taught at the Fuqua School of Business, Duke University. It covers linear regression and time series forecasting models as well as general principles of thoughtful data analysis.

The time series material is illustrated with output produced by Statgraphics, a statistical software package that is highly interactive and has good features for testing and comparing models, including a parallel-model forecasting procedure that I designed many years ago.

The material on multivariate data analysis and linear regression is illustrated with output produced by RegressIt, a free Excel add-in which I also designed. However, these notes are platform-independent. Any statistical software package ought to provide the analytical capabilities needed for the various topics covered here.

There may be complex and unknown relationships between the variables in your dataset.

It is important to discover and quantify the degree to which variables in your dataset are dependent upon each other. This knowledge can help you better prepare your data to meet the expectations of machine learning algorithms, such as linear regression, whose performance will degrade with the presence of these interdependencies.

In this tutorial, you will discover that correlation is the statistical summary of the relationship between variables, and how to calculate it for different types of variables and relationships.

After completing this tutorial, you will know:

- How to calculate a covariance matrix to summarize the linear relationship between two or more variables.
- How to calculate the Pearson’s correlation coefficient to summarize the linear relationship between two variables.
- How to calculate the Spearman’s correlation coefficient to summarize the monotonic relationship between two variables.
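The three quantities listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the tutorial's own code: the helper names (`covariance`, `pearson`, `ranks`, `spearman`) and the sample data are assumptions made for the example, and the covariance shown is the sample version (dividing by n - 1).

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def pearson(xs, ys):
    """Pearson's r: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


def ranks(xs):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print(covariance(x, y), pearson(x, y), spearman(x, y))
```

Because `y` is a monotonic but non-linear function of `x`, Spearman's coefficient reaches 1 while Pearson's stays below 1, which is the distinction between monotonic and linear relationships drawn in the list.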

A receiver operating characteristic (ROC) is a graph that illustrates the performance of a binary classifier as its discrimination threshold (cutoff) is changed.

The curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various cutoff settings. The true positive rate is also known as sensitivity; the false positive rate is also known as the fall-out and can be calculated as (1 - specificity).

The ROC curve is thus a plot of the true positive rate (TPR) versus the false positive rate (FPR). The ROC curve can be generated by plotting the cumulative distribution function (area under the probability distribution from -∞ to the discrimination threshold) of the correct detection probability on the y-axis versus the cumulative distribution function of the false-alarm probability on the x-axis.
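As a rough illustration of the threshold sweep described above, the sketch below computes one (FPR, TPR) point per distinct score and estimates the area under the curve with the trapezoidal rule. The function names `roc_points` and `auc` and the labels and scores are made up for the example; it assumes binary labels (1 = positive, 0 = negative) with at least one of each, and that a higher score means "more likely positive".

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, starting at (0, 0), one per distinct cutoff.

    An item is predicted positive when its score is >= the cutoff, so
    lowering the cutoff can only move the point up and/or to the right.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if s >= cutoff and l == 1)
        fp = sum(1 for l, s in zip(labels, scores) if s >= cutoff and l == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical classifier output: three positives, three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
print(points)
print(auc(points))
```

With these made-up scores, 8 of the 9 positive/negative pairs are ranked correctly, so the area comes out at 8/9. Recounting the whole dataset at every cutoff is quadratic in the worst case; that is fine for a sketch, but a single pass over the sorted scores would be the usual choice for larger data.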

Glossary

- GP: Games Played
- MPG: Minutes Per Game
- ORPM: Player's estimated on-court impact on team offensive performance, measured in points scored per 100 offensive possessions
- DRPM: Player's estimated on-court impact on team defensive performance, measured in points allowed per 100 defensive possessions
- RPM: Player's estimated on-court impact on team performance, measured in net point differential per 100 offensive and defensive possessions. RPM takes into account teammates, opponents and additional factors
- WAR: The estimated number of team wins attributable to each player, based on RPM

Discover, Track and Compare Open Source.