Modular Clock Rev 4 Fails on Reset

2 months 2 weeks ago #12809 by mgerde
I just completed this build. The tubes all seemed to go through the test routine and HV calibration properly. After the initial setup I did get it working and displaying the correct time, but I needed to power it off to move it and start working on an enclosure. When I powered it on later, I immediately noticed problems. No smoke, nothing hot to the touch, just misbehaving. The tubes are blank but the neon separators are constantly on. Usually the LEDs turn red, though I have seen the LEDs stay off, as in the picture I provided. I visually checked the assembly and don't see anything obviously wrong. I bought 2 of these kits at the same time, and I've tried all combinations of the 2 ESP8266 modules and 2 ATmega controllers that I was shipped, with the same results each time.

I can get it working again, temporarily. If I factory reset the clock controller and (if I can get it to connect to wifi) reset the wifi and clock configs on the wifi controller, then go through the normal setup procedure again, it will work. But it does the same thing again as soon as I power cycle it.

Here is a video of the clock functioning normally after setup and failing on power cycle.
Video

If I remove the wifi module, the clock will run as expected but with default settings and the time set as default in the firmware. Replacing the wifi module will bring back the malfunction.

I was able to get the status during that error state once when the wifi module connected to the network:

WLAN IP: 192.168.1.143
WLAN MAC: E8:DB:84:DA:3A:42
WLAN SSID: UniFi_Legacy
NTP Pool: pool.ntp.org
TZ: CST6CDT,M3.2.0,M11.1.0
Clock Name: ESP-DA3A42 ( esp-da3a42.local )
Last NTP time: 2021:11:06 22:06:22
Last NTP update: 7 s ago
Time before next NTP update: 2 h 53 s
Display Time: 2021:11:06 22:06:29
Uptime: 12 s
Version: vx58c
Serial Number: 000979
Status string: WNSUAd
Total clock on time: 7 m 0 s
Communicating with: Clock not found!

ESP8266 information
Sketch compiled: Oct 14 2020 21:08:01
Sketch size: 452160
Free sketch size: 507904
Free heap: 23736
Boot version: 31
CPU Freqency (MHz): 160
SDK version: 2.2.2-dev(38a443e)
Chip ID: 14301762
Flash Chip ID: 1440d8
Flash size: 1048576
Last reset reason: Power On
Vcc: 3.35
LED_BUILTIN / Used: 2/1


I noticed there was no I2C slave listed.

This is the status when I can get it working after resetting everything:

WLAN IP: 192.168.1.143
WLAN MAC: E8:DB:84:DA:3A:42
WLAN SSID: UniFi_Legacy
NTP Pool: pool.ntp.org
TZ: CST6CDT,M3.2.0,M11.1.0
Clock Name: ESP-DA3A42 ( esp-da3a42.local )
Last NTP time: 2021:11:05 16:54:22
Last NTP update: 1 m 23 s ago
Time before next NTP update: 1 h 59 m 38 s
Display Time: 2021:11:05 16:55:45
Uptime: 4 m 17 s
Version: vx58c
Serial Number: 000979
Found I2C slave (Nixie clock) (ping, preferred): 105
Status string: WNSUAd
Total clock on time: 6 m 0 s
Communicating with: NixieFirmwareV2, I2C v62

ESP8266 information
Sketch compiled: Oct 14 2020 21:08:01
Sketch size: 452160
Free sketch size: 507904
Free heap: 21752
Boot version: 31
CPU Freqency (MHz): 160
SDK version: 2.2.2-dev(38a443e)
Chip ID: 14301762
Flash Chip ID: 1440d8
Flash size: 1048576
Last reset reason: Software/System restart
Vcc: 3.35
LED_BUILTIN / Used: 2/1
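
For anyone comparing the two status dumps, a field-by-field diff makes the differences easier to spot. This is only a minimal Python sketch: a handful of fields are copied by hand from the dumps above, and the names `diff_status` and `VOLATILE` are made up for illustration, not part of the clock or WiFi firmware.

```python
# A few fields copied by hand from the two status dumps in this post.
error_state = {
    "Last reset reason": "Power On",
    "Communicating with": "Clock not found!",
    "Uptime": "12 s",
}
working_state = {
    "Last reset reason": "Software/System restart",
    "Found I2C slave (Nixie clock) (ping, preferred)": "105",
    "Communicating with": "NixieFirmwareV2, I2C v62",
    "Uptime": "4 m 17 s",
}

# Hypothetical: fields that differ between any two snapshots and carry no signal.
VOLATILE = {"Uptime"}


def diff_status(a, b, ignore=VOLATILE):
    """Return {key: (a_value, b_value)} for every key whose value differs.

    A key that is missing on one side is reported with None on that side.
    """
    keys = (set(a) | set(b)) - set(ignore)
    return {k: (a.get(k), b.get(k)) for k in sorted(keys) if a.get(k) != b.get(k)}


differences = diff_status(error_state, working_state)
for key, (bad, good) in differences.items():
    print(f"{key}: error={bad!r} working={good!r}")
```

With these inputs the diff reports the missing I2C slave entry, the "Communicating with" line, and the reset reason, which are the three fields that separate the error state from the working state.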


When the clock does a normal startup it displays version 3.56.

I probed with my multimeter while following the diagrams, but I have not found anything concrete as a cause. This is what I see during the error state:

HV is at 210 volts, pin 23 of the controller is at 2.6 volts.
SCL reads at 5 volts but SDA is at 0 volts. VCC is 5 volts on the main board and 3.3 on the wifi module. (Pins 3 and 5 on the wifi module read 3.3 and 0 volts respectively, seeming to correlate with the SCL/SDA pads on the main board).

At this point I am stuck and could really use some help fixing or diagnosing the problem. Thank you!

2 months 2 weeks ago #12810 by Ty_Eeberfest
Thanks for the detailed and coherent problem description! I don't have an answer for you (yet) but I have a few observations and some questions.

The controller and WiFi module firmware are not the very latest. However, it looks to me like they should play nicely together. If you have the equipment to flash new firmware and feel like messing with it, get the latest here:
CLOCK FIRMWARE
WIFI FIRMWARE
I don't think flashing firmware is likely to fix anything but I'm mentioning it here as an option anyway.

The HV generator is continuing to run when the clock is not working, as evidenced by the neon separators being lit. It seems the controller is in a partially crashed state where the program is not being executed but the hardware timer/counter unit continues to function.

Somebody is holding the SDA line low, which is not the normal idle state of I2C. Idle state is both signals tri-stated by all devices allowing the pull-up resistors to pull both lines high. SCL high + SDA low implies that somebody is issuing a Start and getting stuck there. Would be interesting to know the state of SDA with the WiFi module unplugged.
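
The idle-state rule above can be written down as a small check on meter readings. This is a rough Python sketch, not anything from the clock firmware: the function names are invented here, and the 0.3/0.7 of Vcc thresholds are the usual I2C logic-level fractions, taken as an assumption. A multimeter only shows a DC average, so the result is only meaningful when the bus is static, as it is in the crashed state.

```python
def logic_level(volts, vcc, low_frac=0.3, high_frac=0.7):
    """Map a DC reading to 'high', 'low', or 'indeterminate'."""
    if volts >= high_frac * vcc:
        return "high"
    if volts <= low_frac * vcc:
        return "low"
    return "indeterminate"


def classify_bus(scl_v, sda_v, vcc):
    """Classify a static I2C bus from SCL and SDA voltage readings."""
    scl = logic_level(scl_v, vcc)
    sda = logic_level(sda_v, vcc)
    if "indeterminate" in (scl, sda):
        return "indeterminate"  # mid-rail reading: bus may be active or loaded
    if scl == "high" and sda == "high":
        return "idle"  # both lines released, pull-ups win
    if scl == "high" and sda == "low":
        return "sda_held_low"  # looks like a Start that never progressed
    if scl == "low" and sda == "high":
        return "scl_held_low"
    return "both_low"


# Readings reported in the error state: SCL 5 V, SDA 0 V on a 5 V board.
print(classify_bus(5.0, 0.0, 5.0))
```

The same function can be fed the 3.3 V readings from the WiFi module side by passing `vcc=3.3`.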

Key differences between starting from normal power-up and starting from a reset: starting from reset resets a lot of values in EEPROM to defaults and runs the HV calibration; otherwise it's all the same code either way. I'm not seeing why either the EEPROM reset or the HV calibration should cause the problem you are seeing.

When you start with a reset and connect to the WiFi module, can you successfully change settings that should be seen immediately on the clock (e.g. change 24/12 hour mode or change the backlight LED behavior) from the WiFi interface? This is a better test of I2C comms than just setting the time server and seeing the time get corrected.

Have you tried powering up with reset then powering down while the clock is counting and flashing LEDs (before doing the HV calibration) then powering right back up? If so what happens?

That's all I have for now. Ian may be able to shed some more light on the situation especially as relates to firmware versions.

Look into it later when the dust is clearing off the crater.

2 months 2 weeks ago #12811 by Ian
So, to make sure I have the situation absolutely clear: everything works fine without the Wifi module installed, including restarts? As soon as you put the WiFi module in, it starts to behave as in the video?

The HV voltage in the error state is interesting. As Ty says, it indicates that the HV counter/timer is running, but that it is not regulating any more. Normally the HV generator should run only when digits are being displayed; otherwise the voltage creeps up, exactly as you are seeing.

I also think there is something going on with the I2C signals causing the controller to crash, and I am also stumped right now.

If you put it into the crashed state and then pull the Wifi module without restarting the clock, does that do anything?

2 months 2 weeks ago #12812 by mgerde
It may take me a few days to get it done but I will try flashing each board. I do have an UNO I can use and also a programming board that is for ESP-01S boards (I'm hoping it will also program the wifi board from the kit, which looks like an ESP-01).

If I remove the wifi board and reset without it, SCL and SDA each read at 5 volts on the multimeter. If I reproduce the error state and pull the wifi board while it's running, nothing changes. The error state continues and the voltage readings do not change, with SDA in a low state. If I reset/power cycle with the wifi board already off, the clock will start and run with the default time setting.

The few times after fully resetting both boards that I was able to get things working, changes made on the webpage would go through to the clock normally and took effect immediately. So the I2C bus does work at some point, until power cycle.

Here are 2 videos of me going through the whole process. I start in the error state, then pull the wifi board, then power cycle. The clock runs with the default time. I reset the clock controller via the button. I can power cycle and it will still run in first boot mode even if the wifi board is on. I then go through HV calibration. The result of HV calibration is always very dim. I then show how the wifi controller is connected but changes through the web interface don't take effect. I fully reset the wifi controller, and after reconnecting it to my network the two seem to be working normally and changes take effect immediately. I have confirmed both SCL and SDA read 5 volts at that time. The error state is produced again after a power cycle, interestingly without the LEDs on in this case, only the separators.

Video 1 - Error to reset
Video 2 - Reset to working then back to error

I will let you know my results after flashing the boards.

Thank you both!

2 months 2 weeks ago #12813 by Ian
ESP-01S?

There's a problem with the ESP-01S: its LED has been put onto a different pin than on the ESP-01 (not S).

Please check which one you are using. If it is the S version, we may have found our problem.

The ESP-01S has the LED on the same pin as is used for the SDA line (IIRC) and when we start flashing the LED in the code, this disrupts the controller on the clock!
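
To illustrate the kind of clash described here, below is a small Python sketch that checks whether a module's on-board LED shares a GPIO with an I2C signal. Everything in it is an assumption to be verified against the actual hardware and firmware: the LED pins are the commonly reported ones (GPIO1 on the ESP-01, GPIO2 on the ESP-01S), and the I2C mapping simply follows the recollection above that SDA is the affected line. The names `LED_GPIO`, `assumed_i2c` and `led_conflicts` are invented for this example.

```python
# Commonly reported on-board LED pins; check your own module before relying on them.
LED_GPIO = {"ESP-01": 1, "ESP-01S": 2}

# Assumption for illustration only: this thread does not confirm which GPIO
# carries SDA or SCL. If the firmware uses the opposite mapping, swap the values.
assumed_i2c = {"SDA": 2, "SCL": 0}


def led_conflicts(module, i2c_pins):
    """Return the I2C signal names that share a GPIO with the module's LED."""
    led = LED_GPIO[module]
    return sorted(name for name, gpio in i2c_pins.items() if gpio == led)


for module in ("ESP-01", "ESP-01S"):
    clash = led_conflicts(module, assumed_i2c)
    print(module, "conflicts:", clash if clash else "none")
```

Under these assumptions only the ESP-01S reports a conflict. If the firmware turns out to put SCL on that GPIO instead, the function reports SCL, and the effect on the bus would be similar.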

Before we go and reflash any of the clock controllers, let's finish this part of the investigation!

2 months 2 weeks ago - 2 months 2 weeks ago #12814 by Ty_Eeberfest
Ian, I think he got the whole kit, including WiFi Module, from you. Are you shipping any ESP-01S modules?
Mgerde, correct me if I'm wrong about the source of your WiFi Module.

The HV Calibration in Video 1 looked weird to me. Stopping on 246 with no down-counting is not what I'm accustomed to seeing. It looks like when you turn off LDR the brightness of the tubes is good. True? This may be nothing but a red herring but I figure I should mention what I see.

I checked the SDA and SCL lines with a meter on a working Modular Rev. 3.1 (don't have a Rev. 4 here, shouldn't matter) and as expected both lines read +5V.
EDIT: Actually I read +5V on the SDA and SCL pins of the clock controller and +3.3V on the corresponding pins of the WiFi module.

Since SDA is low when in the error state, even with the WiFi module removed, it obviously must be the clock controller holding it low. This might just be happening because the controller is crashed but it gives a hint as to where it's crashing. AFAIK the clock controller only asserts Start on the bus when checking for presence of an RTC module. On a normal start-up the RTC check happens quite early in the start-up sequence, but when starting from a reset it doesn't happen until after the lengthy HV cal procedure has completed. Significant? Hmmm. Is the WiFi module doing something nasty on the bus when it observes the RTC check (observes passively, shouldn't participate), but only when it sees said check within a couple of milliseconds after power is applied? Thinking out loud here.....

Very strange problem, I have a lot to think about.

Also, that device you have for programming ESP-01S modules will work fine with regular ESP-01.

Last edit: 2 months 2 weeks ago by Ty_Eeberfest.
