Theoretical foundations

Weather predictor is a tool that calculates a new weather forecast by combining the forecasts of several weather services according to how well each of them has forecast the weather in the past.

Formulating a task

For simplicity, we assume that there are only two weather services; let's call them 'EnglishWeather' and 'FrenchWeather'. We also assume that there is a weather service that provides the true weather data; let's call it 'TrulyTruthService'.

We assume that we have already collected N days of forecasts from the 'EnglishWeather' and 'FrenchWeather' services, and that we have the true values for the same N days.

Our goal is to use this information to calculate a weight for every service, and then to use these weights to calculate a more precise weather prediction from every new forecast generated by these services (in our case 'EnglishWeather' and 'FrenchWeather').

Also, for simplicity, we'll consider air temperature as the only weather quantity to be forecast.

Looking for solution

So, let's think. To calculate the weights of the services, we need to know how well each service forecasts the weather. What if, for each service ('EnglishWeather' and 'FrenchWeather'), we calculate the forecast error, in other words, how far the forecast values of that service are from the 'TrulyTruthService' values?

Let's calculate the error for 'EnglishWeather' with this formula:

formula

Here $e_E$ is the calculated error, $N$ is the number of values, $x_i$ is the $i$-th value of the 'EnglishWeather' service, $t_i$ is the $i$-th value of the 'TrulyTruthService' service, and the summation runs over the index $i$ from 1 to $N$.

We'll use the same formula for the 'FrenchWeather' service and get its error $e_F$.
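As a minimal sketch of this step, the error of one service could be computed as shown below. The exact error formula is not reproduced in this text, so the sketch assumes a mean absolute error; the function name `forecast_error` and the sample temperatures are hypothetical and only serve as an illustration.

```python
from typing import Sequence


def forecast_error(forecasts: Sequence[float], truths: Sequence[float]) -> float:
    """Mean absolute difference between forecasts and true values (assumed metric)."""
    if len(forecasts) != len(truths) or len(forecasts) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(forecasts)
    return sum(abs(x - t) for x, t in zip(forecasts, truths)) / n


# Hypothetical temperatures for N = 3 days, in degrees Celsius.
truth = [4.0, 7.0, 5.0]
english = [4.0, 6.0, 5.0]
french = [6.0, 9.0, 3.0]

print(forecast_error(english, truth))  # error of 'EnglishWeather'
print(forecast_error(french, truth))   # error of 'FrenchWeather'
```

Any other error measure (for example a root mean squared error) could be swapped in here without changing the rest of the procedure, as long as a smaller value means a better forecast.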

Now we have the errors with which the services forecast the weather. Clearly, the smaller the error a service produces, the bigger the weight it should have.

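Let's write the errors of the services in one vector $e$, where $e_E$ and $e_F$ are the errors of 'EnglishWeather' and 'FrenchWeather':

$$e = (e_E, e_F)$$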

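Normalize the error vector $e$, subtract each element from 1, and then normalize the resulting vector again. Normalization is a proportional scaling of all elements of a vector so that their sum equals 1.

$$w = \mathrm{norm}\bigl(1 - \mathrm{norm}(e)\bigr), \qquad \mathrm{norm}(v)_j = \frac{v_j}{\sum_k v_k}$$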

Now we have the weights of the services relative to each other. The closer a weight is to 1, the more precise the weather forecasts of that service are.
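The normalize, invert, normalize procedure described above can be sketched as follows. The function names `normalize` and `errors_to_weights` are hypothetical, and the handling of an all-zero vector (falling back to equal values) is an assumption of this sketch, since the text does not cover that case.

```python
from typing import List, Sequence


def normalize(values: Sequence[float]) -> List[float]:
    """Scale the values proportionally so that they sum to 1."""
    if len(values) == 0:
        raise ValueError("need at least one value")
    total = sum(values)
    if total == 0:
        # Assumption: when every value is 0, treat all entries as equal.
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def errors_to_weights(errors: Sequence[float]) -> List[float]:
    """Normalize the errors, subtract each from 1, then normalize again."""
    inverted = [1.0 - e for e in normalize(errors)]
    return normalize(inverted)


print(errors_to_weights([0.0, 2.0]))       # two services
print(errors_to_weights([1.0, 1.0, 2.0]))  # three services
```

With only two services the second normalization changes nothing, because the inverted values already sum to 1; it starts to matter with three or more services.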

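Let's calculate a more precise weather forecast using these weights. Since the weight vector is normalized to 1, this is very simple: just sum the new forecasts of the services, each multiplied by the weight of its service, i.e.

$$\hat{f} = w_E f_E + w_F f_F$$

where $f_E$ and $f_F$ are the new forecasts of 'EnglishWeather' and 'FrenchWeather', and $w_E$ and $w_F$ are their weights.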

This solution scales easily to any number of services.
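A minimal sketch of the weighted sum for an arbitrary number of services might look like this. The function name `combine_forecasts` and the three-service numbers are hypothetical; the weights are assumed to be already normalized so that they sum to 1.

```python
from typing import Sequence


def combine_forecasts(forecasts: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the services' forecasts; weights are expected to sum to 1."""
    if len(forecasts) != len(weights):
        raise ValueError("one weight per forecast is required")
    return sum(w * f for w, f in zip(weights, forecasts))


# Hypothetical new forecasts (degrees Celsius) and weights for three services.
print(combine_forecasts([10.0, 12.0, 8.0], [0.5, 0.25, 0.25]))
```

Adding a service only means adding one more forecast and one more weight; the function itself does not change.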

Example

Let's look at a toy example. Suppose we collected the following forecasts for 25.03.2018:

  • 'EnglishWeather' - 4 degrees Celsius
  • 'FrenchWeather' - 6 degrees Celsius

And the real temperature on 25.03.2018 was:

  • 'TrulyTruthService' - 4 degrees Celsius.

Can you already guess which weights the services will have?

So, first we calculate the error vector:

formula

Then we turn it into weights as described earlier (normalize, subtract from 1, normalize again):

$$w = (1, 0)$$

As expected, 'EnglishWeather' gets total confidence because its forecast coincides 100% with the true value, while 'FrenchWeather' has 0% coincidence with the true value, so its weight is equal to 0. Calculating the new weather forecast as the weighted sum of the services' forecasts, we get

$$1 \cdot 4 + 0 \cdot 6 = 4 \text{ degrees Celsius}$$