TTGO doesn't come up when flashed using Web Installer #367

Open
sdmtr opened this issue Jul 19, 2023 · 15 comments


sdmtr commented Jul 19, 2023

Bug report

Problem
Howdy! I've spent the last few hours trying to flash all manner of ESP32 devices with any and all versions of NDL available via the web installer, and each time I'm unable to set the wifi credentials. As an example, if I install the TTGO firmware to a Lilygo TTGO board then once the install is complete, I'm simply dumped back at the initial screen where my only two options are to install NDL or view the console. I did manage to get the wifi setup process to begin once by carefully timing when I clicked on the "connect" button (more about that below), but the process stalled when it reached the part where it searches for available wifi networks.

Steps

  1. Attempt to install NDL
  2. Observe your inability to set wifi credentials

Notes
When I look at the console output, I can see that NDL attempts to retrieve wifi credentials from flash memory, notices that none are set, and therefore reboots. I believe (although I could be totally wrong) that this is where the problem lies, because the Improv module simply doesn't have enough time to connect to the board, retrieve the list of available SSIDs, and receive the credentials from the user, before the device reboots and the connection is dropped. Here's an excerpt of the console logs:

(W) (DrawLoopTaskEntry)(C1) Entering main draw loop!
(I) (setup)(C1) Calling ConnectToWifi()
(I) 
(I) (ConnectToWiFi)(C1) Setting host name to NightDriverStrip...WL_NO_SHIELD
(W) (ConnectToWiFi)(C1) WiFi Credentials not set, cannot connect
(I) (setup)(C1) Unable to connect to WiFi, but must have it, so rebooting...
(I) 
(E) (TerminateHandler)(C1) -------------------------------------------------------------------------------------
(E) (TerminateHandler)(C1) - NightDriverStrip Guru Meditation                              Unhandled Exception -
(E) (TerminateHandler)(C1) -------------------------------------------------------------------------------------
(I) (PrintOutputHeader)(C1) NightDriverStrip
(I) 
(I) (PrintOutputHeader)(C1) ------------------------------------------------------------------------------------------------------------
(I) (PrintOutputHeader)(C1) M5STICKC: 0, USE_M5DISPLAY: 0, USE_OLED: 0, USE_TFTSPI: 0, USE_LCD: 0, USE_AUDIO: 0, ENABLE_REMOTE: 0
(I) (PrintOutputHeader)(C1) Version 37: Wifi SSID: "" - ESP32 Free Memory: 171920, PSRAM:0, PSRAM Free: 0
(I) (PrintOutputHeader)(C1) ESP32 Clock Freq : 240 MHz
(E) (TerminateHandler)(C1) Terminated due to exception: Unable to connect to WiFi, but must have it, so rebooting

As mentioned above, the one time I saw an option to set the wifi credentials didn't come at the end of an installation; it came when I timed clicking the "connect" button so that the web installer connected to the board at a moment when the Improv serial module was up and responding. As far as I can tell, the installer attempts to connect via Improv at the beginning of the session, and if it succeeds it reads the board settings (firmware version and hardware type) and displays the option to set wifi credentials. This is why I think it's a timing issue and that the forced reboot is what's causing the problem in the first place.

Proposed Solution
Don't force a reboot when wifi isn't available. As per the final line in the log above, it seems that NDL restarts the board as soon as it's unable to connect to wifi, either because credentials aren't set or the network isn't available. If that happens then there's no opportunity for the web installer to connect to the board via Improv and set the credentials, so the board is rendered useless.


(Also, I just want to say how excited I am about this project and how utterly cool it is. I've used WLED for a LOT of stuff over the last few years but it has a few idiosyncrasies that I don't love, and NDL looks like it's shaping up to be an incredible replacement going forward. I can't wait to get my hands on a Mesmerizer board and really see what it can do. Thank you so much for making this project available to us mere mortals, Dave.)

Contributor

robertlipe commented Jul 19, 2023 via email

Contributor

robertlipe commented Jul 19, 2023 via email

Collaborator

rbergen commented Jul 19, 2023

There may be a "cleaner" way to trigger reboots on ESP32s, but the approach taken in this project is indeed to throw an std::runtime_error, which will then trigger a reboot. The last line of the log snippet posted by @sdmtr actually kind of illustrates this:

(E) (TerminateHandler)(C1) Terminated due to exception: Unable to connect to WiFi, but must have it, so rebooting
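To illustrate the mechanism described above: in standard C++, an exception that nothing catches ends up in std::terminate, which calls whatever handler was installed with std::set_terminate. The following is only a desktop sketch of that pattern, not the project's actual TerminateHandler. The names DescribeException, SketchTerminateHandler and InstallSketchTerminateHandler are made up for illustration, and std::abort() stands in for whatever restart call the firmware would make.

```cpp
#include <cstdio>
#include <cstdlib>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: produce the kind of message a terminate handler could log.
std::string DescribeException(std::exception_ptr ep)
{
    if (!ep)
        return "Terminated without an active exception";
    try
    {
        std::rethrow_exception(ep);
    }
    catch (const std::exception &e)
    {
        return std::string("Terminated due to exception: ") + e.what();
    }
    catch (...)
    {
        return "Terminated due to an unknown exception";
    }
}

// Hypothetical handler. On real firmware this is where a restart would be
// requested; here std::abort() stands in so the sketch builds on a desktop.
[[noreturn]] void SketchTerminateHandler()
{
    const std::string message = DescribeException(std::current_exception());
    std::fprintf(stderr, "%s\n", message.c_str());
    std::abort();
}

// Call once early in startup; afterwards any uncaught exception reaches the handler.
void InstallSketchTerminateHandler()
{
    std::set_terminate(SketchTerminateHandler);
}
```

With a handler like this installed, a throw std::runtime_error("...") that nothing catches would produce one log line and then end the program, which on a microcontroller typically means a reset.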

This reboot happening makes sense if the build has been configured to require WiFi. That is the case if both ENABLE_WIFI and WAIT_FOR_WIFI are defined as non-zero. In the "regular" project configurations as they stand, this is only the case for the LEDSTRIP project. That is obviously not the same as the TTGO project.

So:

  1. I'm not sure how one would end up in the code path that triggers the reboot if WiFi doesn't come up, if one was flashing the TTGO project.
  2. The case made in this issue is relevant for the LEDSTRIP project, and I'll give that some thought.

Contributor

robertlipe commented Jul 19, 2023 via email

Collaborator

rbergen commented Jul 19, 2023

ESP.restart();

Yes, it had to be something as straightforward as that, didn't it? 🙂

I think what I mentioned as "current project MO" is a case of semi-consistently applied legacy code. Which I'll instantly grant can be replaced by more modern/now recommended approaches. However, personally I'm not going to implement that change in the context of this issue. If someone else wants to open a PR to do so, I'm very happy to review it.

Contributor

davepl commented Jul 19, 2023 via email

Contributor

robertlipe commented Jul 19, 2023 via email

Author

sdmtr commented Jul 20, 2023

> What you're seeing isn't a controlled "hey, let's reboot now". That's just a plain ole crash.

It's a crash only insofar as it's the result of an unhandled exception, but the exception in question is caused by the lack of wifi credentials, which there's no opportunity to provide. See the last line of the log excerpt:

(E) (TerminateHandler)(C1) Terminated due to exception: Unable to connect to WiFi, but must have it, so rebooting

> FWIW, if it halves your testing matrix, you can just whack the 'USE_NETWORK' in the configuration when building firmware to see if that's a key variable.

This issue is specific to the web installer, I'm not building from source. If I were then this wouldn't be a problem, the correct credentials would have been set in secrets.h, but this bug report is explicitly about the web installer and proposing a way to fix it so that other people don't run into this same problem when they're trying to get their boards up and running. The vast majority of people who will end up using NDL won't be building from source; as with WLED it'll be people who just want to click a button on a web installer to load a pre-compiled binary onto their ESP32 that they can immediately start using.

> This reboot happening makes sense if the build has been configured to require WiFi. That is the case if both ENABLE_WIFI and WAIT_FOR_WIFI are defined as non-zero. In the "regular" project configurations as they stand, this is only the case for the LEDSTRIP project. That is obviously not the same as the TTGO project.

I gave the TTGO project just as an example, this happens for every board I've tested with and every version of the firmware available via the web installer that uses wifi. The issue isn't that the reboot is happening per se, that's intentional behaviour due to the unhandled exception being thrown; the issue is that if the board immediately reboots when wifi credentials aren't available then logically there will never be an opportunity to actually provide those credentials, making web installations effectively useless for anything other than projects that don't use wifi at all.

> Is there a scenario in which it doesn’t restart?

No, all combinations I've tried enter this reboot loop as soon as it notices there are no wifi credentials available, making it effectively impossible to send it the credentials since the Improv service isn't alive long enough to communicate with the web installer. This seems to be intended behaviour, NDL intentionally throws an exception when it can't connect to wifi, and because that exception is unhandled the correct behaviour is to restart.

> If the default behavior on an unhandled exception is to restart, then I think what we’re doing now is pretty clean.

Absolutely, rebooting on an unhandled exception is perfectly fine, the issue here though is that I don't believe this should be an unhandled exception in the first place - after all, how can Improv connect to the board to deliver the wifi credentials it needs if the firmware reboots the moment it discovers it has no credentials?

A better way to handle it might be to simply wait for credentials to be delivered, just sit in an idle loop until Improv receives the correct RPC via serial. Otherwise I can't see how it would ever be possible to use the web installer to successfully load NDL onto a board, the only possible way to do it would be to build NDL from source with credentials preloaded in secrets.h.

I should note that WLED also uses Improv in the same way you're using it, and their solution is to just continue booting up like normal and display whatever the default strip pattern is, while also exposing an access point that the user can connect to in order to use the web interface and continuing to listen for Improv RPCs over the serial port.
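A minimal sketch of the "wait for credentials instead of rebooting" idea, assuming nothing about how NDL or Improv are actually wired together. WiFiCredentials, NetworkStep and CredentialGate are hypothetical names used only for this illustration. The point is that missing credentials become a state the firmware stays in, rather than an error that ends the boot.

```cpp
#include <string>
#include <utility>

struct WiFiCredentials
{
    std::string ssid;
    std::string password;
};

enum class NetworkStep
{
    WaitForCredentials, // keep running and keep listening for provisioning
    AttemptConnection   // credentials exist, so try to join the network
};

class CredentialGate
{
  public:
    // Hypothetical hook: called when a provisioning channel delivers credentials.
    void OnCredentialsReceived(WiFiCredentials credentials)
    {
        credentials_ = std::move(credentials);
    }

    // Decide what the network code should do on this pass; never asks for a reboot.
    NetworkStep NextStep() const
    {
        return credentials_.ssid.empty() ? NetworkStep::WaitForCredentials
                                         : NetworkStep::AttemptConnection;
    }

    const WiFiCredentials &Credentials() const { return credentials_; }

  private:
    WiFiCredentials credentials_;
};
```

In this shape the network task would call NextStep() on each pass and simply idle while it returns WaitForCredentials, so the serial provisioning channel stays alive for as long as the user needs.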

Collaborator

rbergen commented Jul 20, 2023

> I gave the TTGO project just as an example, this happens for every board I've tested with and every version of the firmware available via the web installer that uses wifi. The issue isn't that the reboot is happening per se, that's intentional behaviour due to the unhandled exception being thrown; the issue is that if the board immediately reboots when wifi credentials aren't available then logically there will never be an opportunity to actually provide those credentials, making web installations effectively useless for anything other than projects that don't use wifi at all.

@sdmtr What I'm saying is two things:

  1. The board rebooting because of a) no credentials being available and then b) instantly giving up on WiFi connectivity - and in fact, the whole boot - is counterproductive if there is still a path to getting the credentials. As there is in the web installer scenario. That's something I think we should indeed (re)consider.
  2. The board should not reboot for all projects that use WiFi. As I said in my previous comment, the one code path that leads to the reboot you provided logging of is only triggered (or actually, compiled in) when WiFi is enabled (ENABLE_WIFI is 1) and WAIT_FOR_WIFI is 1. This is handled by lines 629 to 637 in main.cpp.
    In the current project configurations, WAIT_FOR_WIFI is only defined to be 1 for the LEDSTRIP project, so that's the only project for which the WiFi-related reboots should happen.
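The condition described in point 2 can be written down as a small sketch. This is not the code from main.cpp: the real project compiles the reboot path in or out with the preprocessor, whereas the hypothetical MustRebootWithoutWiFi helper below expresses the same decision as a function so that each combination is easy to inspect. The default values given to ENABLE_WIFI and WAIT_FOR_WIFI here are assumptions for illustration only.

```cpp
// Assumed values for illustration only; the real project sets these per build.
#ifndef ENABLE_WIFI
#define ENABLE_WIFI 1
#endif
#ifndef WAIT_FOR_WIFI
#define WAIT_FOR_WIFI 0
#endif

// Hypothetical helper: the condition under which the boot would be abandoned.
constexpr bool MustRebootWithoutWiFi(bool enableWiFi, bool waitForWiFi, bool connected)
{
    return enableWiFi && waitForWiFi && !connected;
}

// The same decision for the flags this translation unit was compiled with.
constexpr bool MustRebootWithoutWiFi(bool connected)
{
    return MustRebootWithoutWiFi(ENABLE_WIFI != 0, WAIT_FOR_WIFI != 0, connected);
}
```

Only the combination with both flags non-zero and no connection yields true, which matches the statement that a build with WAIT_FOR_WIFI set to 0 should never take the reboot path.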

@davepl About WAIT_FOR_WIFI, I basically see two options:

  • Dropping it altogether, meaning we never reboot if WiFi fails.
  • Always setting it to 0 when CI builds the web installer. This is something I could quite easily "sed" my way through in the CI workflow file.

As I'm not sure why the "reboot if WiFi connectivity fails" behavior was originally implemented, I was wondering if you could give some input on this?

Author

sdmtr commented Jul 20, 2023

@rbergen RE your second point, you're absolutely right. My original comment mixes observations and logs from many different tests across different boards rather than focusing on one specific test, which has created some confusion and led to some inaccuracies on my part, sorry about that.

I just did another handful of quick tests using just the TTGO board, and the ledstrip firmware is indeed the one that reboots when wifi credentials aren't present (as you correctly pointed out.) The TTGO firmware on the other hand seems to be experiencing a different problem, although I'm not sure what exactly. Here's the full log:

rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13192
load:0x40080400,len:3028
entry 0x400805e4
E (655) esp_core_dump_flash: No core dump partition found!
E (655) esp_core_dump_flash: No core dump partitionReplacing Idle Tasks with TaskManager...
(I) (PrintOutputHeader)(C1) NightDriverStrip
(I) 
(I) (PrintOutputHeader)(C1) ------------------------------------------------------------------------------------------------------------
(I) (PrintOutputHeader)(C1) M5STICKC: 0, USE_M5DISPLAY: 0, USE_OLED: 0, USE_TFTSPI: 1, USE_LCD: 0, USE_AUDIO: 1, ENABLE_REMOTE: 1
(I) (PrintOutputHeader)(C1) Version 37: Wifi SSID: "" - ESP32 Free Memory: 256532, PSRAM:0, PSRAM Free: 0
(I) (PrintOutputHeader)(C1) ESP32 Clock Freq : 240 MHz
(I) (setup)(C1) Startup!
(I) (setup)(C1) Starting DebugLoopTaskEntry
>> Launching Debug Thread.  Mem: 256532, LargestBlk: 110580, PSRAM Free: 0/0, >> Launching JSON Writer Thread.  Mem: 253560, LargestBlk: 110580, PSRAM Free: 0/0, (W) (DeviceConfig)(C1) DeviceConfig could not be loaded from JSON, using defaults
(W) (NotifyJSONWriterThread)(C1) >> Notifying JSON Writer Thread
(W) (setup)(C1) Starting ImprovSerial
(W) (ReadWiFiConfig)(C1) Retrieved SSID and Password from NVS: , ********
E (588) gpio: GPIO can only be used as input mode
[   599][E][esp32-hal-gpio.c:130] __pinMode(): GPIO config failed
E (592) gpio: gpio_set_level(226): GPIO output gpio_num error
E (597) gpio: GPIO can only be used as input mode
[   612][E][esp32-hal-gpio.c:130] __pinMode(): GPIO config failed
E (607) gpio: gpio_set_level(226): GPIO output gpio_num error
(W) (setup)(C1) Creating TFT Screen
(W) (setup)(C1) Allocating LEDStripGFX for channel 0
(I) (setup)(C1) Could allocate 26 buffers but limiting it to 20
(I) 
(W) (setup)(C1) Reserving 20 LED buffers for a total of 46720 bytes...
(I) (setup)(C1) Adding LEDs to FastLED...
(I) (setup)(C1) Adding 768 LEDs to FastLED.
(W) (InitEffectsManager)(C1) InitEffectsManager...
(I) (InitEffectsManager)(C1) Creating EffectManager using default effects
>> Launching Drawing Thread.  Mem: 193960, LargestBlk: 110580, PSRAM Free: 0/0, (W) (DrawLoopTaskEntry)(C1) >> DrawLoopTaskEntry
(W) 
(W) (DrawLoopTaskEntry)(C1) Entering main draw loop!
>> Launching Screen Thread.  Mem: 186808, LargestBlk: 110580, PSRAM Free: 0/0, >> Launching Audio Thread.  Mem: 183872, LargestBlk: 110580, PSRAM Free: 0/0, >> Launching Remote Thread.  Mem: 179132, LargestBlk: 110580, PSRAM Free: 0/0, (I) (AudioSamplerTaskEntry)(C0) >>> Sampler Task Started
(W) (begin)(C1) Remote Control Decoding Started
>> Launching Network Thread.  Mem: 174528, LargestBlk: 110580, PSRAM Free: 0/0, [  1720][E][ESPmDNS.cpp:65] begin(): Failed starting MDNS
Error starting mDNS
>> Launching ColorData Thread.  Mem: 169540, LargestBlk: 110580, PSRAM Free: 0/0, (E) (TerminateHandler)(C0) -------------------------------------------------------------------------------------
(E) (TerminateHandler)(C0) - NightDriverStrip Guru Meditation                              Unhandled Exception -
(E) (TerminateHandler)(C0) -------------------------------------------------------------------------------------
(I) (PrintOutputHeader)(C0) NightDriverStrip
(I) 
(I) (PrintOutputHeader)(C0) ------------------------------------------------------------------------------------------------------------
>> Launching Socket Thread.  Mem: 166476, LargestBlk: 110580, PSRAM Free: 0/0, (I) (PrintOutputHeader)(C0) M5STICKC: 0, USE_M5DISPLAY: 0, USE_OLED: 0, USE_TFTSPI: 1, USE_LCD: 0, USE_AUDIO: 1, ENABLE_REMOTE: 1
(I) (PrintOutputHeader)(C0) Version 37: Wifi SSID: "" - ESP32 Free Memory: 162132, PSRAM:0, PSRAM Free: 0
(I) (PrintOutputHeader)(C0) ESP32 Clock Freq : 240 MHz
I) (PrintOutputHeader)(C0) ESP32 Clock Freq : 240 MHz
 Free Mem
abort() was called at PC 0x4018f1b4 on core 0


Backtrace: 0x4008512d:0x3ffda7a0 0x40090169:0x3ffda7c0 0x40095da1:0x3ffda7e0 0x4018f1b4:0x3ffda860 0x4018f206:0x3ffda880 0x4018f59b:0x3ffda8a0 0x400828cd:0x3ffda8c0 0x40082965:0x3ffda8e0




ELF file SHA256: 8b20cee5cd509615

E (2306) esp_core_dump_flash: Core dump flash config is corrupted! CRC=0x7bd5c66f instead of 0x0
Rebooting...

For the sake of clarity, this is an authentic Lilygo TTGO ESP32-DOWDQ6 board, and I selected "ESP32" as the device type and "TTGO" as the project in the web installer interface.

Collaborator

rbergen commented Jul 20, 2023

Thanks @sdmtr for clearing this up. It does help focus the analysis of the problems (now plural) we are investigating. Putting the no-WiFi reboot aside for now - I think we now know what we're looking at there - I'd say the logging on the TTGO crash doesn't provide too many insights as to what's going on. The "abort()" mention at the bottom of the log doesn't help much either, as the C++ code in the project doesn't call any function by that name.

I don't own any TTGO boards myself, so I can't compare what you're seeing to anything useful at my end - maybe @davepl can. A question I do have is if you've tried flashing the board using the PlatformIO route? I know the issue relates to the web installer specifically, but trying to flash the board the other way may well narrow the area that needs to be covered while investigating this.

Contributor

davepl commented Jul 20, 2023 via email

Collaborator

rbergen commented Jul 20, 2023

My question comes from the fact that we only treat WiFi not connecting as an exception worthy of rebooting in the LEDSTRIP project, not any of the others. In all other projects, we continue trying to connect in the main.cpp loop() every so many seconds.

Rebooting immediately after establishing that no credentials are present, as LEDSTRIP does, keeps the user from providing credentials via Improv. That means that for LEDSTRIP, the correct credentials have to be embedded into the image (i.e. secrets.h) for the image to work.
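The "keep trying every so many seconds" behavior mentioned above could look roughly like the sketch below. It is a generic pattern rather than the logic in main.cpp; RetryTimer is a hypothetical class, and the caller is assumed to pass in a millisecond tick count (for example from Arduino's millis()).

```cpp
#include <cstdint>

// Hypothetical helper that says when the next connection attempt is due.
class RetryTimer
{
  public:
    explicit RetryTimer(std::uint32_t intervalMs) : intervalMs_(intervalMs) {}

    // Returns true on the first call and then at most once per interval.
    // Unsigned subtraction keeps the comparison valid when the tick counter wraps.
    bool ShouldAttempt(std::uint32_t nowMs)
    {
        if (!attempted_ || static_cast<std::uint32_t>(nowMs - lastAttemptMs_) >= intervalMs_)
        {
            attempted_ = true;
            lastAttemptMs_ = nowMs;
            return true;
        }
        return false;
    }

  private:
    std::uint32_t intervalMs_;
    std::uint32_t lastAttemptMs_ = 0;
    bool attempted_ = false;
};
```

A loop() function would call ShouldAttempt on every pass and only try to connect when it returns true, so the rest of the firmware, including any serial provisioning listener, keeps running between attempts.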

Contributor

davepl commented Jul 20, 2023 via email

rbergen changed the title from "Board reboots before Web Installer can set Wi-Fi credentials" to "TTGO doesn't come up when flashed using Web Installer" on Jul 22, 2023
Collaborator

rbergen commented Jul 22, 2023

I've opened #371 for the LEDSTRIP reboot issue, and renamed this one to focus on TTGO failing to come up.
