[NEXT-1314] High memory usage in deployed Next.js project #49929
Comments
Is this related to Server Actions, have you isolated the case?
@thexpand This is not related to Server Actions. It is a severe memory leak starting from v13.3.5-canary.9. I was going to open a bug but found this one. @shuding I suspect your PR #49116, as the other changes in that canary are unlikely to cause this. Can you please take a look? This blocks us from upgrading to the latest Next.js.
Tech Stack:
Proofs:

- 13.3.5-canary.8 vs 13.3.5-canary.9
- 13.3.5-canary.8 vs 13.3.5-canary.9 with all images unoptimized
- 13.3.5-canary.8 vs 13.4.4-canary.0 (to test latest canary) with all images unoptimized + middleware removed

So, as you can see, the leak comes not from ….

P.S. I also checked ….
I created this reproduction repo using the latest canary version of Next.js for the error documented before. In the repo I am using …. I documented this in a new issue since it seems to be a different error: #50909
I created a different reproduction repo using the latest canary version of Next.js. The error is crashing the dev server when an import is missing. https://nextjs.org/docs/messages/module-not-found
```
<--- Last few GCs --->
[2218:0x5eb9a70] 40167 ms: Mark-sweep 252.1 (263.9) -> 250.1 (263.7) MB, 206.0 / 0.0 ms (average mu = 0.174, current mu = 0.125) allocation failure scavenge might not succeed
[2218:0x5eb9a70] 40404 ms: Mark-sweep 252.4 (263.9) -> 250.6 (264.2) MB, 216.7 / 0.0 ms (average mu = 0.135, current mu = 0.086) allocation failure scavenge might not succeed

<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0xb02930 node::Abort() [/usr/local/bin/node]
 2: 0xa18149 node::FatalError(char const*, char const*) [/usr/local/bin/node]
 3: 0xcdd16e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 4: 0xcdd4e7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 5: 0xe94b55 [/usr/local/bin/node]
 6: 0xe95636 [/usr/local/bin/node]
 7: 0xea3b5e [/usr/local/bin/node]
 8: 0xea45a0 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
 9: 0xea751e v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
10: 0xe68a5a v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/usr/local/bin/node]
11: 0x11e17c6 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/usr/local/bin/node]
12: 0x15d5439 [/usr/local/bin/node]
```

I documented this in a new issue since it seems to be a different error: #51025
I created a reproduction using Docker to showcase how a simple Next.js project crashes when used in environments with ~225 MB of memory. Steps to reproduce:
I created this reproduction by building a Docker image from a simple Next.js app and using autocannon to fake traffic to the website from a button.
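For reference, here is a minimal sketch of how that kind of traffic can be generated with autocannon from Node; the URL and load parameters below are assumptions for illustration, not the reporter's exact setup:

```js
// load-test.js: a hypothetical sketch of generating traffic with autocannon.
// Assumptions: the Dockerized Next.js app listens on http://localhost:3000,
// and the connection/duration values are illustrative only.
const autocannon = require('autocannon');

autocannon(
  {
    url: 'http://localhost:3000', // assumed local address of the container
    connections: 50, // concurrent connections (illustrative)
    duration: 60, // seconds of sustained load (illustrative)
  },
  (err, result) => {
    if (err) throw err;
    // Throughput/latency stats are reported here; container memory usage
    // has to be watched separately (e.g. with `docker stats`).
    console.log('avg req/sec:', result.requests.average);
  }
);
```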
Is this related to #49677 maybe?
@Starefossen thanks! I've read that issue and found one connection with my setup: #49677 (comment). I also run my Next.js with …. I will try to find time and repeat my upgrade without this option or some other change related to this. I can see that it already fixed the problem for that developer.
+1, the same is happening on our systems. No dev on my team with 8 GB of RAM is able to work with it. This also happens specifically when we are using the App Router, and it's painful to switch back and forth between routing definitions. Next version: 13.4.4
Usage graph running on Railway. On June 15th, image optimization was disabled. The cache rate also drastically increased, which may be related. Not sure how much of an issue this is in serverless land since processes don't run long enough to have memory leaks. Similar: #44685
+1, the same situation can be seen in our deployed Next.js app :/
Same issue here, running on Next.js 13.4.6 deployed on Fly.io. I worked around the problem by allocating 2048 MiB of memory to the instance and a 512 MiB swap as a buffer. As you can see, I'm only delaying the inevitable OOM, but this at least makes the issue much less frequent. You can find the source code here: https://github.com/hampuskraft/arewepomeloyet.com.
Tried the newest ….

Here's another example of a pod that has min: 1 / max: 2 replicas, where the green one has been alive for a while and the yellow one came up and initially used 300 MB, then jumped to 520 MB as soon as a single request hit it. This app isn't using a single ….

Here's the same app in production that's actually getting a few thousand visits:
Try uninstalling sharp.
I already uninstalled sharp and disabled image optimization; it didn't help in my case.
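For context, a minimal sketch of what disabling the built-in image optimization in next.config.js typically looks like (not necessarily the exact config the commenters above used; verify the option against your Next.js version):

```js
// next.config.js: a minimal sketch of turning off built-in image optimization,
// so images are served as-is instead of going through the optimizer (and Sharp).
/** @type {import('next').NextConfig} */
const nextConfig = {
  images: {
    unoptimized: true,
  },
};

module.exports = nextConfig;
```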
Not sure if this is Next's supported solution, so YMMV, but the only thing that helped us on 13.4.4+ was to set:
That disables the new appdir support which became the default in 13.4, but also turns off the extra workers. It also fixed the leaked-socket issue causing crashes/timeouts (#51560), which appears related. The extra processes (see #45508 for build, but also next start) are leaking as far as I can tell, causing everyone's memory issues. Might not be the exact cause, but highly correlated for sure.
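The exact snippet did not survive in the comment above; based on the description (disabling the App Router default and the extra workers on 13.4.x), a plausible reconstruction is the experimental appDir flag shown below. Treat this as an assumption and verify it against the Next.js version you are on:

```js
// next.config.js: a hedged reconstruction, not the commenter's verbatim config.
// Assumption: the option referred to is the experimental appDir flag, which matches
// the description of disabling App Router support (and its extra render workers).
/** @type {import('next').NextConfig} */
const nextConfig = {
  experimental: {
    appDir: false,
  },
};

module.exports = nextConfig;
```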
@hampuskraft I should say: our site is using ….
@tghpereira again, please read my earlier posts...
This should fix the memleak issue we are seeing. See vercel/next.js#49929
Hey everyone,
It's out on …. We've also made a change to the implementation using Sharp to reduce the amount of concurrency it handles (usually it would take all CPUs). That should help a bit with peak memory usage when using Image Optimization. I'd like to get a reproduction for the Image Optimization causing high memory usage so that it can be investigated in a new issue, so if someone has one, please provide it. With these changes landed, I think it's time to close this issue, as these changes cover the majority of comments posted. We can post a new issue specifically tracking memory usage with image optimization. There is a separate issue for development memory usage already.
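For illustration of the mechanism only (this is not the Next.js-internal change referenced above), Sharp exposes a global concurrency setter; a sketch of capping its libvips thread pool in a standalone script:

```js
// A sketch of limiting Sharp's libvips thread pool in a standalone script.
// Illustrative only; not the Next.js-internal change referenced above.
const os = require('node:os');
const sharp = require('sharp');

// By default Sharp may use one thread per CPU core; fewer threads trade
// image-processing throughput for lower peak memory usage.
sharp.concurrency(Math.max(1, Math.floor(os.cpus().length / 2)));

console.log('sharp concurrency set to', sharp.concurrency());
```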
Thanks Tim, I upgraded the original projects @ProchaLu mentioned in the OP (the ones deployed on the free Fly.io machines with 256MB RAM) to ….
cc Fly.io folks @michaeldwan @rubys @jeromegn @dangra @mrkurt so that you're aware that deploying Next.js apps with App Router can lead to OOM (Out of Memory) errors on Fly.io with the free tier ("Free allowances") and 256MB RAM, in case this would represent a business reason for Fly.io to upgrade the base free allowance RAM to 512MB.

As mentioned in my last message above, we have now upgraded to the latest Next.js version, and I have yet to see a crash on Fly.io because of OOM errors, but in case the issue persists after some time, you may also hear this from other customers in the future.
@karlhorky thanks for letting us know. FTR: for the case of apps running on 256MB machines, adding swap memory usually helps: https://fly.io/docs/reference/configuration/#swap_size_mb-option
Thanks for the extra tip about the swap memory - we tried that as well, after getting that tip in the community post.
@karlhorky all good, the only nuance is that that post describes how to set up swap space manually, while the link I shared only requires adding a swap_size_mb option (see the configuration link above).
Going to close this issue as mentioned yesterday, as all changes / investigation have landed and there haven't been new reports since shipping my changes earlier. Keep in mind that Node.js below 18.17.0 has a memory leak in ….

I've opened a separate issue about the Image Optimization memory usage; we'll need a reproduction there. If one is not provided the issue will auto-close, so please provide one, thank you! Link: #54482.

Thanks to everyone that provided a reproduction that we could actually investigate.
Verify canary release
Provide environment information
```
Operating System:
  Platform: darwin
  Arch: arm64
  Version: Darwin Kernel Version 22.1.0: Sun Oct 9 20:14:30 PDT 2022; root:xnu-8792.41.9~2/RELEASE_ARM64_T8103
Binaries:
  Node: 18.15.0
  npm: 9.5.0
  Yarn: 1.22.19
  pnpm: 8.5.0
Relevant packages:
  next: 13.4.3-canary.1
  eslint-config-next: N/A
  react: 18.2.0
  react-dom: 18.2.0
  typescript: 5.0.4
```
Which area(s) of Next.js are affected? (leave empty if unsure)
No response
Link to the code that reproduces this issue
https://codesandbox.io/p/github/ProchaLu/next-js-ram-example/
To Reproduce
Describe the Bug
I have been working on a small project to reproduce an issue related to memory usage in Next.js. The project is built using the Next.js canary version 13.4.3-canary.1. It utilizes Next.js with App Router and Server Actions and does not use a database.

The problem arises when deploying the project on different platforms and observing the memory usage behavior. I have deployed the project on multiple platforms for testing purposes, including Vercel and Fly.io.
On Vercel: https://next-js-ram-example.vercel.app/
When interacting with the deployed version on Vercel, the project responds as expected. The memory usage remains stable and does not show any significant increase or latency.
On Fly.io: https://memory-test.fly.dev/
However, when deploying the project on Fly.io, I noticed that the memory usage constantly remains around 220 MB, even during normal usage scenarios.
Expected Behavior
I expect the small project to run smoothly without encountering any memory-related issues when deployed on various platforms, including Fly.io. Considering the previous successful deployment on Fly.io, which involved additional resource usage and utilized Next.js 13 with App Router and Server Actions, my anticipation is that the memory usage will remain stable and within acceptable limits.
Fly.io discussion: https://community.fly.io/t/high-memory-usage-in-deployed-next-js-project/12954?u=upleveled
Which browser are you using? (if relevant)
Chrome
How are you deploying your application? (if relevant)
Vercel, fly.io
NEXT-1314