Panic when driving larger matrixes (one long strip) on 32MB 8MB ESP32-S3-DevKitC1 #580

Open
GameTec-live opened this issue Dec 28, 2023 · 46 comments

@GameTec-live

Bug report

32MB FLASH 8MB PSRAM ESP32-S3-DevKitC1 (ESP32-S3-DevKitC-1-N32R8V)

Problem

Steps

  1. Modify env:demo to compile properly for the chip (SPIFFS Failing on a 32MB 8MB ESP32-S3-DevKitC1 #579)
[env:demo]
extends         = dev_esp32-s3
build_flags     = -DDEMO=1
                  ${dev_esp32-s3.build_flags}
                  ${psram_flags.build_flags}
board_build.partitions = config/partitions_custom_8M.csv
board_upload.flash_size = 32MB
board_build.flash_mode = qio
board_build.arduino.memory_type = opi_opi
  2. In globals.h, define a matrix width and height larger than 36 (37 or more results in the panic)
  3. Observe the core 1 panic and the system reboot
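
To put numbers on the threshold described in the steps above, here is a minimal sketch of the arithmetic. It assumes both dimensions are set to the same value and that each pixel takes 3 bytes (one each for red, green and blue, as in FastLED's CRGB). The names `ledCount` and `frameBytes` are hypothetical helpers for illustration, not NightDriverStrip functions.

```cpp
#include <cstddef>

// Assumption: 3 bytes per pixel (red, green, blue).
constexpr std::size_t kBytesPerPixel = 3;

// Number of LEDs in a width x height matrix built from one continuous strip.
constexpr std::size_t ledCount(std::size_t width, std::size_t height)
{
    return width * height;
}

// Size in bytes of one frame of raw pixel data for that matrix.
constexpr std::size_t frameBytes(std::size_t width, std::size_t height)
{
    return ledCount(width, height) * kBytesPerPixel;
}
```

Under these assumptions, 36x36 is 1296 LEDs (3888 bytes per frame) and 37x37 is 1369 LEDs (4107 bytes per frame). Whether that step in size has anything to do with the panic is not established in this thread.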

Example

Notes
In this case, a "matrix" is just a bunch of daisy-chained LED strips going back and forth.

My globals.h
globals.h.txt

The exact same config file works fine on a generic, less powerful ESP32.
Reportedly, running 1500 LEDs on one controller isn't optimal, but it works fine and I get a decent frame rate on my underpowered ESP32. It shouldn't crash anyway ;)

Monitor Log:
https://hastebin.skyra.pw/wikevijele.yaml

@rbergen
Collaborator

rbergen commented Jan 4, 2024

@GameTec-live The hastebin link at the end of your description leads to an empty page with a blinking cursor in my browser.

@GameTec-live
Author

GameTec-live commented Jan 4, 2024

> @GameTec-live The hastebin link at the end of your description leads to an empty page with a blinking cursor in my browser.

Oh, weird... I'll reupload later... (when I'm home)

@GameTec-live
Author

crashlog.log
Sorry for the late reply, but here you go...

@robertlipe
Contributor

robertlipe commented Jan 4, 2024 via email

@GameTec-live
Author

GameTec-live commented Jan 4, 2024

I changed build_type under base from release to debug, and set monitor_filters = esp32_exception_decoder under [env:demo].

Here's the new log:
debug-log.log

@robertlipe
Contributor

robertlipe commented Jan 4, 2024 via email

@GameTec-live
Author

Probably the stupidest thing you've heard in a while, but... I do have a Pico debug probe (SWD) and can't figure out where I'm supposed to hook it up...

@robertlipe
Contributor

robertlipe commented Jan 5, 2024 via email

@GameTec-live
Author

Okay, thanks for the help, I managed to get a debugger running (it was a driver issue...)... I don't know what the value of mCur is supposed to be, etc... I'll poke around a bit more and report back...

@GameTec-live
Author

I can't really see anything unusual? (I mean, I also don't know what those values are supposed to be, lol.)
mPixelData seems to be large or even "infinite", as I can request a lot more than 1500 array entries with the debugger... mCur also goes past 1500...
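
A note on reading these values: a raw pointer carries no length, so a debugger will happily display memory well past the end of whatever it points to. Being able to request more than 1500 entries therefore says nothing about how large the buffer really is. The sketch below illustrates that with a hypothetical `PixelBuffer` type (not a FastLED or NightDriverStrip type) in which a separately stored size is the only record of what is valid. It also assumes 3 bytes per pixel. If mCur counts bytes rather than pixels (an assumption that this thread does not confirm), values up to 3 × 1500 = 4500 would be expected.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a driver's view of its pixel data.
struct PixelBuffer {
    const std::uint8_t* data;  // raw pointer: carries no length of its own
    std::size_t sizeBytes;     // the only record of how many bytes are valid
};

// Assumption: 3 bytes per pixel (red, green, blue).
constexpr std::size_t bufferBytes(std::size_t ledCount)
{
    return ledCount * 3;
}

// True only if byteIndex refers to a byte inside the buffer.
constexpr bool isValidByteIndex(const PixelBuffer& buffer, std::size_t byteIndex)
{
    return byteIndex < buffer.sizeBytes;
}
```

When inspecting in the debugger, comparing the running index against the stored size (rather than against the LED count) is the more meaningful check.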

@rbergen
Collaborator

rbergen commented Jan 7, 2024

@GameTec-live Glad to see you got the debugger working, and I appreciate the earlier upload of the debug log. As @robertlipe already indicated, it does show that the actual problem (which is a form of illegal memory access) takes place several levels below "our" code. In fact, the backtrace doesn't even include references to any code that's part of NightDriverStrip.

Concerning your last comment, we're obviously not looking over your shoulder, so we can't see what you are looking at. Also, even if we could, figuring out the cause of the invalid behavior would effectively require us to debug the dependency libraries involved. That isn't entirely impossible, but it is very difficult if we can't ourselves debug trace through the code just before the problem occurs.

Without having the hardware and software setup that triggers these crashes available, I therefore think we won't be able to solve this. You could (still) consider raising a bug report in the dependent libraries (Espressif ESP-IDF and/or FastLED) and see if they are able to provide pointers to what's actually behind this.

I'll leave this issue open in case someone else runs into the same problem, and may be able to provide additional information that can help get to the root of this.

@GameTec-live
Author

Yeah, kinda hard to replicate an issue without hardware... I'll open an issue over at FastLED... Thanks for the help though...

@prschguy1

Hey GameTec-live,
While I am not one of the bigger brains on this repository, perhaps I can provide some help with the problems you are experiencing here. My understanding is that demo is intended for strip effects only. I was a bit surprised that you tried putting spectrum into a demo build. I have never tried that, but assumed it wouldn't work as newer spectrum builds use DMA. Instead of using demo as your build, I might suggest using spectrum-elecrow; I have that working properly on the chip you specify. It looks good with PDM mic, remote, display, and all the effects. When I run it with strip effects, I run out of memory, but it runs well on spectrum. You might give it a try. Memory usage looks pretty good here. Just my 2 cents.
Capture

20240108_191857.mp4

@GameTec-live
Author

thx for the info, ig I'll try that... My matrix is just one long strip though...

@GameTec-live
Author

Although using the spectrum-elecrow project seems to work, it doesn't help me much, as spectrum drives HUB-whatever-it's-called matrices and I just have one long strip snaking back and forth...

@prschguy1

All of the spectrum builds use a series of strips that zig-zag back and forth as you describe. They can be identified by the fact that only one pin is used for output, LED_PIN0. HUB75 is currently only used for the Mesmerizer builds, which require around 14 output pins depending on what you are doing. Dave does a great job of describing all of this here:

https://www.youtube.com/watch?v=COJnlehBcKw&t=224s

The video I posted is using a standard zig-zag strip as you describe, at 16 pixels high by 48 pixels wide on the spectrum build, using the chip you specified.
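
For readers unfamiliar with the zig-zag layout, the usual way to address one long strip as a matrix is a serpentine index mapping. The following is a minimal sketch, not NightDriverStrip's actual mapping code. It assumes the strip runs along the width, with even rows wired left to right and odd rows right to left; `serpentineIndex` is a hypothetical name. A matrix wired column by column would swap the roles of x and y.

```cpp
#include <cstddef>

// Maps matrix coordinates (x, y) to a position along the single strip.
// Assumption: rows run along the width; even rows go left to right,
// odd rows go right to left.
constexpr std::size_t serpentineIndex(std::size_t x, std::size_t y, std::size_t width)
{
    return (y % 2 == 0) ? (y * width + x)
                        : (y * width + (width - 1 - x));
}
```

With a width of 48, the end of row 0 (x = 47) is strip position 47 and the pixel directly below it (x = 47, y = 1) is position 48, which is what makes the strip "snake" back.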

@GameTec-live
Author

ah, ok... so ig I will fumble a bit more with spectrum elecrow and try to use that... thanks...

@GameTec-live
Author

GameTec-live commented Jan 12, 2024

nvm, setting the matrix to the required size crashes elecrow too...

@prschguy1

Perhaps if you could articulate exactly what you are trying to accomplish, we might be able to help. As far as I am seeing, this chip is working without any crashes for the spectrum build. While there aren't a lot of spectrum effects here, intend to test it more thoroughly.

@GameTec-live
Author

> Perhaps if you could articulate exactly what you are trying to accomplish, we might be able to help. As far as I am seeing, this chip is working without any crashes for the spectrum build. While there aren't a lot of spectrum effects here, intend to test it more thoroughly.

Drive my 50x30 matrix (which is one long strip) with the new, more powerful ESP32-S3... It works fine with my current, less powerful ESP32 (afaik it's even single-core?). I had to disable nice-to-haves like the web server, though, for it to run a stream from the computer at a decent frame rate, which I'm hoping to fix with this much more powerful variant...
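
As background on the frame rate side, the wire protocol itself puts a ceiling on how fast one long strip can be refreshed from a single pin, regardless of CPU power. The sketch below works that out under the assumption that the strip uses WS2812-style signalling at 800 kHz (1.25 µs per bit, 24 bits per pixel); the thread does not state the LED type, and the function names are hypothetical.

```cpp
// Assumption: WS2812-style signalling at 800 kHz, i.e. 1.25 microseconds per bit.
constexpr double kMicrosPerBit = 1.25;
// 8 bits each for green, red and blue.
constexpr int kBitsPerPixel = 24;

// Time in microseconds to clock out one frame, plus the reset (latch) gap.
constexpr double frameMicros(int ledCount, double resetMicros)
{
    return ledCount * kBitsPerPixel * kMicrosPerBit + resetMicros;
}

// Upper bound on frames per second imposed by the wire protocol alone.
constexpr double maxFramesPerSecond(int ledCount, double resetMicros)
{
    return 1000000.0 / frameMicros(ledCount, resetMicros);
}
```

For 1500 LEDs this gives 45 ms per frame before the reset gap, so roughly 22 frames per second at best on one pin under these assumptions. Splitting the strip across several output pins is the usual way to raise that ceiling.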

@GameTec-live
Author

Stupid question, but can some of the devs or someone more competent than me try and compile demo with a 50x30 matrix?
I tried to compile elecrow with 50x30 and got a similar error. I also tried to compile demo for a Seeed Studio ESP32-C3 (ik, not officially supported) and it's still the same error (from what it looks like).
I haven't tried a NodeMCU (clone) yet, as that's currently driving the matrix and I'd rather not break it until my replacement MCU works... :/

@robertlipe
Contributor

robertlipe commented Feb 12, 2024 via email

@GameTec-live
Author

GameTec-live commented Feb 12, 2024

OK, that makes sense then, it was just the one I had lying around... I'd still be interested to know if someone else can replicate this issue, or if it's just me or maybe even a defective MCU... And having a minimal reproducible example might help speed things up here (or over at FastLED).

@prschguy1

Hello GameTec-live, I have looked at your situation a good bit over the last couple of weeks, and do get similar results. While I can approach your matrix size, I cannot quite get there. I have a similar problem when trying to use this chip with 4-channel strip effects. I have tried different board define files as well as different memory tables, but still have not solved this problem. While an alternate LED program makes both of our builds work, I would like to get it working here. What I have learned is that programs can make use of 16 MB, and the 32 MB that these chips support is only useful for storage. I have tried both of our builds with the Unexpected Maker S3 Pro, Wemos S3, and generic and official builds of ESP32-S3 in various memory configurations. I have a Wemos D32 Pro and an M5Stack that I'll try our build on next. The M5Stack used to work on my build, but I suspect that no longer works. I will let you know what I find.

@GameTec-live
Author

Ah, so it isn't just me XD
Thanks for trying to help though.

@GameTec-live
Author

@robertlipe while ordering other stuff, I threw in an N16R8, so the 16MB version you apparently use... I still get the same panic...

@robertlipe
Contributor

robertlipe commented May 15, 2024 via email

@davepl
Contributor

davepl commented May 15, 2024 via email

@rbergen
Collaborator

rbergen commented May 15, 2024

Should we maybe include the fix for this in #626?

I believe it is as simple as not touching the on-board LED. We could achieve that by just commenting out the respective #ifdef ESP32FEATHERTFT block in globals.h.

@davepl
Contributor

davepl commented May 15, 2024 via email

@rbergen
Collaborator

rbergen commented May 15, 2024

I think the on-board LED code is only activated if ONBOARD_PIXEL_POWER is defined.

@robertlipe
Contributor

robertlipe commented May 15, 2024 via email

@GameTec-live
Author

GameTec-live commented May 19, 2024

So, it seems like #626 is supposed to contain a fix for this? I pulled down that branch/fork and compiled it: nothing, it still panics.
For clarity, I've now generated git diffs of the exact changes I made and once again uploaded the logs (for both of my controllers).
(The N16R8 log is a bit weird though, as it doesn't spit out the core dump? It did before, with the same illegal cache access, but when capturing the log it just didn't...)
N16R8-crashlog.txt
N32R8V-diff.patch
N32R8V-crashlog.txt
N16R8-diff.patch

Edit: of course secrets.h isn't included, but it's just a filled-out template with hostname and WiFi credentials, etc...

@rbergen
Collaborator

rbergen commented May 19, 2024

No, this is a different problem than the one the latest change in #626 is trying to address. That fixes random crashes without any panic logging on the S3 with the Feather project "as standard", after the device has been running for a while.

Your controllers seem to consistently crash as soon as WiFi tries to connect, with PSRAM enabled. I think I remember Dave configuring PSRAM to be off on all devices except Mesmerizer because he was seeing exactly this behavior.

@GameTec-live
Author

Ah, OK... Well, I tried leaving the PSRAM off (not adding the build flag) too, but it didn't work either. I can try again later and send some logs for that too...

@rbergen
Collaborator

rbergen commented May 19, 2024

Then it seems we have 3 different problems - which I wouldn't find surprising at all.

(I know that observation adds absolutely nothing towards a solution, but it's all I can conclude at this point in time...)

@GameTec-live
Author

GameTec-live commented May 19, 2024

Well, observation is usually the first step to figuring stuff out? XD

Anyway, enjoy the other 2 log files and diffs of the 2 PSRAM-less builds:

N32R8V-noPSRAM-diff.patch
N32R8V-noPSRAM-crashlog.txt
N16R8-noPSRAM-diff.patch
N16R8-noPSRAM-crashlog.txt

Edit: Whoops, that one log was captured from the wrong port, no wonder it's so short. Here's the longer one:
N16R8-noPSRAM-crashlog.txt

@robertlipe
Contributor

robertlipe commented May 19, 2024 via email

@davepl
Contributor

davepl commented May 19, 2024 via email

@rbergen
Collaborator

rbergen commented May 20, 2024

> Agreed. It looks like it IS a different problem. So we have Dave's problem of treating PSRAM pins as LEDs. We have Gametec's issue of the crash inside FastLED when "lots of LEDs" are used. (Is network traffic a key?) What's the third? Sorry. My on-board CPU is running slow today.

Problem number 3, as I remember it(!), is S3 boards crashing at WiFi connect when PSRAM is enabled. Which I may have misremembered, in which case there is no problem number 3.

@GameTec-live
Author

@rbergen you seem to be misremembering, as at least for me on my 2 MCUs, with PSRAM enabled, driving a smaller matrix (e.g. 10x10) boots up perfectly, etc...
For completeness, here are the logs and git diffs anyway:
N16R8-smallWorking-diff.patch
N32R8V-smallWorking-diff.patch
N32R8V-smallWorking-log.txt
N16R8-smallWorking-log.txt

@rbergen
Collaborator

rbergen commented May 20, 2024

No, I'm not - or at least not in that way. The point is that there are now 3 problems with (certain) S3 boards we know of. Your problem - that being the one this issue concerns primarily - is the second problem in @robertlipe's summary. There are two others, though. One is the first Robert mentions, the other the one I mentioned in my previous comment.

@robertlipe
Contributor

robertlipe commented May 20, 2024 via email

@rbergen
Collaborator

rbergen commented May 20, 2024

> That means we have only Gametec's still live and in play, right?

Based on what you say in the rest of your comment, that's quite likely true.

> If we track problems we used to have, I'm sure we'll all lose our minds even more quickly.

That may be true for some, many or most, but not me. I need to keep some record of problems we used to have to retain my sanity. If only because some "solved" problems have the unpleasant habit of rearing their heads again sometime later.

But, I can be quiet about it if that's generally preferred. :)

@davepl
Contributor

davepl commented May 20, 2024 via email

@robertlipe
Contributor

robertlipe commented May 20, 2024 via email
