POST to monitormywatershed.org broken #658

Closed
neilh10 opened this issue May 17, 2023 · 6 comments

@neilh10

neilh10 commented May 17, 2023

When POSTing to monitormywatershed.org, I'm getting a 301.
This started happening after the upgrade on March 12th (v0.15.0?).

#316
#542

On a POST to monitormywatershed.org

Connected Internet

pubDQTR Sending data to [ 0 ] monitormywatershed.org:80
POST /api/data-stream/ HTTP/1.1
Host: monitormywatershed.org
TOKEN: 0cf7c40a-232e-457d-87d6-cea5c0757fec
Content-Length: 409
Content-Type: application/json

{"sampling_feature":"236c674b-69b9-43af-b0d6-33d67b870ecc","timestamp":"2023-05-17T10:12:06-08:00","8c57835f-a32f-4d62-82dc-0ba09f04cf52":1,"3bebd4a3-8b54-4f92-ba55-5fd2fd021358":3.927,"03e7b375-97a7-4423-a3f0-1d822d8b19b9":18.00,"43bcda9b-2973-4639-af2c-f0b6bb3fa44b":0.1467,"08646cc3-c5de-414c-af65-c795b2dcac24":59.82,"8849814d-1603-4a2f-861f-f31ae68cccf3":19.35,"7182846e-46e0-4a10-b110-9bc32de4aca9":-18}

-- Response Code -- 301 waited  326 mS Timeout 8000

when I change to data.envirodiy.org

Connected Internet

pubDQTR Sending data to [ 0 ] data.envirodiy.org:80
POST /api/data-stream/ HTTP/1.1
Host: data.envirodiy.org
TOKEN: 0cf7c40a-232e-457d-87d6-cea5c0757fec
Content-Length: 409
Content-Type: application/json

{"sampling_feature":"236c674b-69b9-43af-b0d6-33d67b870ecc","timestamp":"2023-05-17T10:37:43-08:00","8c57835f-a32f-4d62-82dc-0ba09f04cf52":1,"3bebd4a3-8b54-4f92-ba55-5fd2fd021358":3.927,"03e7b375-97a7-4423-a3f0-1d822d8b19b9":18.10,"43bcda9b-2973-4639-af2c-f0b6bb3fa44b":0.1468,"08646cc3-c5de-414c-af65-c795b2dcac24":59.13,"8849814d-1603-4a2f-861f-f31ae68cccf3":19.62,"7182846e-46e0-4a10-b110-9bc32de4aca9":-20}

-- Response Code -- 201 waited  577 mS Timeout 8000
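
The two requests above differ only in the host. A minimal Python sketch for reproducing the comparison from a desktop, assuming plain HTTP on port 80 and no redirect following (so a 301 stays visible, as it does on the logger). The helper names `build_request` and `post_reading` are hypothetical and not part of ModularSensors; the path and header names are copied from the log output.

```python
import http.client
import json

API_PATH = "/api/data-stream/"  # path taken from the logger output above


def build_request(token, reading):
    """Return (body, headers) for one data-stream POST.

    Mirrors the headers in the logger output: TOKEN, Content-Length, Content-Type.
    """
    body = json.dumps(reading, separators=(",", ":")).encode("utf-8")
    headers = {
        "TOKEN": token,
        "Content-Length": str(len(body)),
        "Content-Type": "application/json",
    }
    return body, headers


def post_reading(host, token, reading, timeout_s=8.0):
    """POST one reading over plain HTTP and return the raw status code.

    http.client does not follow redirects, so a 301 is reported as 301.
    """
    body, headers = build_request(token, reading)
    conn = http.client.HTTPConnection(host, 80, timeout=timeout_s)
    try:
        conn.request("POST", API_PATH, body=body, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling `post_reading` once with `monitormywatershed.org` and once with `data.envirodiy.org`, using the same token and reading, should show the two status codes side by side.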

@ptomasula
Member

ptomasula commented May 23, 2023

Thanks @neilh10, this is very helpful information!

I was initially confused by this behavior because our git history shows no changes to the nginx config file. Nginx presently handles HTTP/S routing for us. Upon further inspection of the settings file on the production server, I found a block at the top of the specification which explicitly reroutes only http traffic to monitormywatershed.org to the https equivalent.

Generally speaking, that is the behavior we want for most site traffic; however, I have an explicit exception to not reroute */api/ traffic, because the current upload protocol was not designed to handle the 301 redirect. I have removed that block from the nginx config, so I would expect your original POST to monitormywatershed.org to now work correctly and not return a 301.

I'll look into how that block made its way into our production config. It is possible that our certificate management system automatically populated that block, but I'll look into it further.

Would you be able to try your original post again and verify the correct behavior?
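
For readers following along, the intended routing rule can be modelled in a few lines. This is only an illustration in Python of the rule described above, not the actual nginx configuration (which is not shown in this thread); the function name `route` is hypothetical.

```python
def route(scheme, host, path):
    """Return (status, location) the front end would answer with.

    status is None when the request is passed through to the application
    unchanged. Only plain-http requests outside /api/ are redirected.
    """
    if scheme == "http" and not path.startswith("/api/"):
        return 301, f"https://{host}{path}"
    return None, None
```

Under this model, an http request for a site page is redirected to https, while an http POST to `/api/data-stream/` is passed straight through, which is what the loggers rely on.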

ptomasula self-assigned this May 23, 2023
ptomasula added the bug label May 23, 2023
@neilh10
Author

neilh10 commented May 23, 2023

Hi @ptomasula, thanks for looking at it.
Hurrah, data from the stopped field sites is starting to flow again.

The sites that stopped delivering data Apr 12
https://monitormywatershed.org/sites/nh_LCC45/
https://monitormywatershed.org/sites/TUCA_PO03/
https://monitormywatershed.org/sites/TUCA_Sa01/
as described here https://www.envirodiy.org/topic/systems-not-recognized-from-12th-v0-15-0/
have now all recorded a series of the latest readings - indicating that POSTs to monitormywatershed.org:80 are getting through!!

The reliable delivery algorithms that I've implemented should now kick in -
for https://monitormywatershed.org/sites/nh_LCC45/ this will attempt to upload 100 readings every 15 minutes.
The other two communicate every hour, and then upload 100 readings.

#485

However, trying from my test station at my desk,
POST /api/data-stream/ HTTP/1.1
Host: monitormywatershed.org

I'm getting timeouts after 8 seconds (I then lengthened it to 10 seconds)
[2023-05-23 10:10:32.215] -- Response Code -- 504 waited 8012 mS Timeout 8000
[2023-05-23 10:16:24.737] -- Response Code -- 504 waited 10011 mS Timeout 10000

Then suddenly it accepts the POST and responds in 700mS

[2023-05-23 10:14:22.397] -- Response Code -- 201 waited 770 mS Timeout 10000
so mostly working .. :)

@ptomasula
Member

Thanks @neilh10, glad to hear it is mostly working now. A 504 response is a gateway timeout. I have received a few notifications about brief performance issues this afternoon, so it is likely related to that.

The performance slowdowns might actually be an artifact of the batch upload. When that algorithm attempts to reupload, I assume it issues each data point as a separate request? I don't think the endpoint supports batch upload, so I think it would have to be multiple requests. We have a request to support that, but have not implemented it yet.

@neilh10
Author

neilh10 commented May 23, 2023

@ptomasula it seems to be delivering slowly.

The core architectural discussion seems to me to be "reliable delivery", which can then be broken into separate architectural components. If the server benefits from multiple uploads per TCP/IP session, then that would be one approach.
What the first pass of my algorithm does is make multiple upload attempts per physical connection - that is, it sets up the telecom wireless link connecting to the internet, and then (I think) it sets up the TCP/IP session, uploads a reading, and tears the session down. If the response is 201, it waits a programmable timeout (typically 1 second) with the physical connection up, then repeats.
When the queue is empty, or the response is not 201, or power is low, it closes the internet connection.

for LCC45 these are defined as
LOGGING_INTERVAL_MINUTES=15
COLLECT_READINGS=0 ; Number of readings to collect before send 0to30
SEND_OFFSET_MIN=0 ;minutes to wait after collection complete 0-30
POST_MAX_NUM =100 ;On POSTing MAX NUM after which defered next connection
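
The per-connection loop described above can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not the firmware itself: `drain_queue`, `send`, `power_ok` and `pause_s` are hypothetical names, and `post_max_num` stands in for the POST_MAX_NUM setting.

```python
import time
from collections import deque


def drain_queue(queue, send, post_max_num=100, pause_s=1.0,
                power_ok=lambda: True, sleep=time.sleep):
    """Upload queued readings over one physical connection.

    Stops when the queue is empty, a POST returns anything other than 201,
    power is low, or post_max_num readings have been sent.
    Returns the number of readings delivered.
    """
    sent = 0
    while queue and sent < post_max_num and power_ok():
        reading = queue[0]        # peek: keep the reading until it is acknowledged
        if send(reading) != 201:
            break                 # leave it queued for the next connection
        queue.popleft()
        sent += 1
        if queue and sent < post_max_num:
            sleep(pause_s)        # physical link stays up between POSTs
    return sent
```

Here `queue` is a `collections.deque` of readings and `send` is any callable that returns the HTTP status code. A reading is only removed from the queue after a 201, so a timeout defers it to the next connection rather than dropping it.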

A timeout also represents a lot of used power.

For https://monitormywatershed.org/sites/nh_LCC45/, uploading the historical data seems to be hitting a lot of timeouts. In 90 minutes of elapsed time, with 6 upload attempts at 15-minute intervals, it has only uploaded 3 historical data items. It is set to upload a max of 100 items each attempt - so it could have uploaded 6*100 historical items.
For all the failures below, (main) ModularSensors could lose the data.

Here is the data from my test system, set to upload every 2 minutes, starting with a backlog of data, with the first few entries annotated

[2023-05-23 10:14:14.770] -- Response Code -- 201 waited 938 mS Timeout 10000
[2023-05-23 10:14:19.023] -- Response Code -- 201 waited 627 mS Timeout 10000
[2023-05-23 10:14:22.397] -- Response Code -- 201 waited 770 mS Timeout 10000
[2023-05-23 10:14:25.565] -- Response Code -- 201 waited 566 mS Timeout 10000 - upload 1+3 queue data

[2023-05-23 10:16:24.737] -- Response Code -- 504 waited 10011 mS Timeout 10000 - failed, que reading
[2023-05-23 10:18:23.625] -- Response Code -- 201 waited 3770 mS Timeout 10000
[2023-05-23 10:18:27.956] -- Response Code -- 201 waited 723 mS Timeout 10000 - upload 1+ 1 queue data
[2023-05-23 10:20:23.964] -- Response Code -- 201 waited 8974 mS Timeout 10000
[2023-05-23 10:22:24.720] -- Response Code -- 504 waited 10011 mS Timeout 10000- failed, que reading
[2023-05-23 10:24:29.584] -- Response Code -- 504 waited 10010 mS Timeout 10000- failed, que reading
[2023-05-23 10:26:29.589] -- Response Code -- 504 waited 10011 mS Timeout 10000- failed, que reading
[2023-05-23 10:28:23.621] -- Response Code -- 201 waited 3795 mS Timeout 10000
[2023-05-23 10:28:27.780] -- Response Code -- 201 waited 543 mS Timeout 10000
[2023-05-23 10:28:31.122] -- Response Code -- 201 waited 723 mS Timeout 10000
[2023-05-23 10:28:34.400] -- Response Code -- 201 waited 676 mS Timeout 10000 - upload 1 + 3 queue data
[2023-05-23 10:30:24.727] -- Response Code -- 504 waited 10011 mS Timeout 10000
[2023-05-23 10:32:21.130] -- Response Code -- 201 waited 4180 mS Timeout 10000
[2023-05-23 10:32:25.458] -- Response Code -- 201 waited 661 mS Timeout 10000 - upload 1 + 3 queue data
[2023-05-23 10:34:18.829] -- Response Code -- 201 waited 3854 mS Timeout 10000
[2023-05-23 10:36:18.812] -- Response Code -- 201 waited 3830 mS Timeout 10000
[2023-05-23 10:38:24.722] -- Response Code -- 504 waited 10010 mS Timeout 10000
[2023-05-23 10:40:29.536] -- Response Code -- 504 waited 10001 mS Timeout 10000
[2023-05-23 10:42:23.582] -- Response Code -- 201 waited 3760 mS Timeout 10000
[2023-05-23 10:42:28.102] -- Response Code -- 201 waited 879 mS Timeout 10000
[2023-05-23 10:42:31.454] -- Response Code -- 201 waited 757 mS Timeout 10000 - upload 1+ 3 queue data
[2023-05-23 10:44:18.519] -- Response Code -- 201 waited 3541 mS Timeout 10000
[2023-05-23 10:46:18.659] -- Response Code -- 201 waited 3687 mS Timeout 10000
[2023-05-23 10:48:23.694] -- Response Code -- 201 waited 3854 mS Timeout 10000
[2023-05-23 10:50:29.556] -- Response Code -- 504 waited 10010 mS Timeout 10000
[2023-05-23 10:52:25.270] -- Response Code -- 201 waited 5434 mS Timeout 10000
[2023-05-23 10:52:29.363] -- Response Code -- 201 waited 471 mS Timeout 10000
[2023-05-23 10:54:18.501] -- Response Code -- 201 waited 3528 mS Timeout 10000
[2023-05-23 10:56:18.820] -- Response Code -- 201 waited 3855 mS Timeout 10000
[2023-05-23 10:58:18.567] -- Response Code -- 201 waited 3613 mS Timeout 10000
[2023-05-23 11:00:29.560] -- Response Code -- 504 waited 10010 mS Timeout 10000
[2023-05-23 11:02:23.277] -- Response Code -- 201 waited 3482 mS Timeout 10000
[2023-05-23 11:02:27.341] -- Response Code -- 201 waited 421 mS Timeout 10000
[2023-05-23 11:04:18.381] -- Response Code -- 201 waited 3434 mS Timeout 10000
[2023-05-23 11:06:18.385] -- Response Code -- 201 waited 3444 mS Timeout 10000
[2023-05-23 11:08:18.378] -- Response Code -- 201 waited 3432 mS Timeout 10000
[2023-05-23 11:10:29.547] -- Response Code -- 504 waited 10011 mS Timeout 10000
[2023-05-23 11:12:23.305] -- Response Code -- 201 waited 3494 mS Timeout 10000
[2023-05-23 11:12:27.337] -- Response Code -- 201 waited 410 mS Timeout 10000
[2023-05-23 11:14:18.419] -- Response Code -- 201 waited 3469 mS Timeout 10000
[2023-05-23 11:16:18.379] -- Response Code -- 201 waited 3434 mS Timeout 10000
[2023-05-23 11:18:18.406] -- Response Code -- 201 waited 3457 mS Timeout 10000
[2023-05-23 11:20:29.507] -- Response Code -- 504 waited 10000 mS Timeout 10000
[2023-05-23 11:22:23.281] -- Response Code -- 201 waited 3446 mS Timeout 10000
[2023-05-23 11:22:27.284] -- Response Code -- 201 waited 397 mS Timeout 10000
[2023-05-23 11:24:18.376] -- Response Code -- 201 waited 3434 mS Timeout 10000
[2023-05-23 11:26:18.364] -- Response Code -- 201 waited 3432 mS Timeout 10000
[2023-05-23 11:28:18.434] -- Response Code -- 201 waited 3457 mS Timeout 10000
[2023-05-23 11:30:29.580] -- Response Code -- 504 waited 10011 mS Timeout 10000
[2023-05-23 11:32:23.303] -- Response Code -- 201 waited 3482 mS Timeout 10000
[2023-05-23 11:32:27.321] -- Response Code -- 201 waited 398 mS Timeout 10000
[2023-05-23 11:34:18.386] -- Response Code -- 201 waited 3420 mS Timeout 10000
[2023-05-23 11:36:18.390] -- Response Code -- 201 waited 3432 mS Timeout 10000
[2023-05-23 11:38:18.394] -- Response Code -- 201 waited 3432 mS Timeout 10000
[2023-05-23 11:40:29.533] -- Response Code -- 504 waited 10001 mS Timeout 10000
[2023-05-23 11:42:23.255] -- Response Code -- 201 waited 3469 mS Timeout 10000
[2023-05-23 11:42:27.320] -- Response Code -- 201 waited 419 mS Timeout 10000
[2023-05-23 11:44:18.485] -- Response Code -- 201 waited 3518 mS Timeout 10000
[2023-05-23 11:46:18.388] -- Response Code -- 201 waited 3420 mS Timeout 10000
[2023-05-23 11:48:18.470] -- Response Code -- 201 waited 3529 mS Timeout 10000
[2023-05-23 11:50:24.821] -- Response Code -- 201 waited 7854 mS Timeout 10000
[2023-05-23 11:52:18.411] -- Response Code -- 201 waited 3459 mS Timeout 10000
[2023-05-23 11:54:18.386] -- Response Code -- 201 waited 3421 mS Timeout 10000
[2023-05-23 11:56:18.453] -- Response Code -- 201 waited 3494 mS Timeout 10000
[2023-05-23 11:58:23.260] -- Response Code -- 201 waited 3469 mS Timeout 10000
[2023-05-23 12:00:21.619] -- Response Code -- 201 waited 6674 mS Timeout 10000
[2023-05-23 12:02:18.391] -- Response Code -- 201 waited 3432 mS Timeout 10000
[2023-05-23 12:04:18.362] -- Response Code -- 201 waited 3408 mS Timeout 10000
[2023-05-23 12:06:23.282] -- Response Code -- 201 waited 3482 mS Timeout 10000
[2023-05-23 12:08:18.359] -- Response Code -- 201 waited 3410 mS Timeout 10000
[2023-05-23 12:10:20.324] -- Response Code -- 201 waited 5372 mS Timeout 10000
[2023-05-23 12:12:18.344] -- Response Code -- 201 waited 3420 mS Timeout 10000
[2023-05-23 12:14:20.386] -- Response Code -- 201 waited 3470 mS Timeout 10000
[2023-05-23 12:16:18.356] -- Response Code -- 201 waited 3433 mS Timeout 10000
[2023-05-23 12:18:18.352] -- Response Code -- 201 waited 3420 mS Timeout 10000
[2023-05-23 12:20:20.046] -- Response Code -- 201 waited 5096 mS Timeout 10000
[2023-05-23 12:22:23.257] -- Response Code -- 201 waited 3471 mS Timeout 10000
[2023-05-23 12:24:18.383] -- Response Code -- 201 waited 3445 mS Timeout 10000
[2023-05-23 12:26:18.358] -- Response Code -- 201 waited 3420 mS Timeout 10000
[2023-05-23 12:28:18.338] -- Response Code -- 201 waited 3420 mS Timeout 10000
[2023-05-23 12:30:29.499] -- Response Code -- 504 waited 10010 mS Timeout 10000
[2023-05-23 12:32:23.236] -- Response Code -- 201 waited 3470 mS Timeout 10000
[2023-05-23 12:32:27.254] -- Response Code -- 201 waited 398 mS Timeout 10000
[2023-05-23 12:34:18.410] -- Response Code -- 201 waited 3529 mS Timeout 10000
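
For anyone wanting to characterize a log like this, a small Python sketch that tallies response codes and the median wait of successful POSTs. The line format is taken from the output above; the names `LINE_RE` and `summarize` are hypothetical.

```python
import re
from collections import Counter
from statistics import median

# Matches e.g. "-- Response Code -- 201 waited 938 mS Timeout 10000"
LINE_RE = re.compile(r"Response Code -- (\d{3}) waited\s+(\d+) mS Timeout (\d+)")


def summarize(lines):
    """Count response codes and report the median wait (ms) of 201 responses."""
    codes = Counter()
    ok_waits = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue              # skip blank lines and anything that is not a response line
        code, waited = int(m.group(1)), int(m.group(2))
        codes[code] += 1
        if code == 201:
            ok_waits.append(waited)
    return {
        "codes": dict(codes),
        "median_201_wait_ms": median(ok_waits) if ok_waits else None,
    }
```

Feeding it the lines above (for example from a saved text file) gives the 201/504 split and a typical success latency without counting by hand.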

@neilh10
Author

neilh10 commented May 24, 2023

@ptomasula thanks, it's still working.
It seems I've ended up using a non-standard endpoint - or at least it broke for http://monitormywatershed.org but not for http://data.envirodiy.org

Seemed this was a repeat of #522 (comment)
I'm just wondering: is the list of target endpoints defined anywhere, and which is the preferred one?

The historical data upload is at a snail's pace, and even simple POSTs fail with no response in 10 seconds on my test system.
There is a suggestion for a simple database efficiency upgrade - the low-hanging fruit, so to speak -
tpwrules :
The primary bottleneck with the server in its official incarnation is actually inserting data records into the database due to inefficient use of the ORM and transactions and subsequent timeouts from the lengthy processing. Improving this is pretty simple and results in several times more speed for a single point.
#649 (comment)

@neilh10
Author

neilh10 commented May 30, 2023

@ptomasula it's still working, and I'll close this issue.
The server timeouts are pretty bad, and my field systems are barely managing to upload the data from the month's outage - but I'll put that in a separate characterization issue.

It would be good to know where the official entrypoints for the server are documented.
