2024-2025 Daybreak Software Retrospective

Daybreak Firmware Problems:

  • Very old code was being used

  • Different implementations of code across different repos for no reason

  • Inconsistencies in formatting (especially with CAN messages)

  • A mix of interrupts and polling makes timings weird

CAN Message Scheme

Problems:

  • We had too many discrete CAN IDs whose messages only carried 1 or 2 bits. For example, we had a BPS_TRIP message, a BPS All Clear message, and a BPS Fault State message.

    • These messages could be combined into one longer message as a bitmap instead of taking up 3 of the limited number of discrete CAN filter slots we have (~14)

    • This also makes programming easier: instead of watching for 3 different messages arriving at different times, we could just look at one message

    • This can get weird in some places, since the different fields in one message can be owned by different tasks, so you may introduce race conditions

  • CAN messages had crazy jitter and were often based on aperiodic things, i.e. a thread would block on some interrupt and then send a CAN message right after that.

    • We want timings to be as consistent as possible, since inconsistent timings will be multiplied tenfold when we actually get on the car and will be very hard to debug

    • I'd suggest OS timers running in the background of the RTOS that send messages periodically, since our next-gen code is much more interrupt heavy than it was in Daybreak.

  • The CAN CSVs are dumb

    • DBCs are the standard for storing CAN information, and provide us a lot of good opportunities to autogenerate macros and functions for encoding and decoding CAN messages

    • Our goal: if you completely change the structure of a CAN message, like the ordering of signals or which value corresponds to what, you can just run a Python script to regenerate a header file, and your code will work with no changes needed

    • Could introduce CI/CD checks in GitHub that tell us whether there are conflicting IDs and whether the DBCs are valid

  • Priorities are organized by system rather than importance

    • Daybreak CAN ID priorities are organized by system (Controls gets 0x5 and BPS gets 0x1 addresses).

    • Messages that are unimportant for car operation, like BPS SOC, would then take precedence over a message that is needed for car operation, like IO State

    • IDs needed for car operation should be higher priority (in CAN, the lower ID wins arbitration)

    • In CAN filters you can also filter based on an ID mask rather than a specific ID, which solves the limited CAN filter problem, so keep that in mind

      • Messages needed by the VCU should share a pattern like 0x11, so one CAN filter can mask on 0x11
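
Two of the ideas above, folding the single-bit BPS messages into one bitmap and accepting a group of IDs with one mask filter, can be sketched in plain C. The bit positions, the `BPS_STATUS_*` names, and the function names are made up for illustration and are not the real Daybreak or next-gen layout. The filter check only mirrors the general ID/mask idea (compare the bits that are set in the mask), not any specific peripheral's register format.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for one combined BPS status byte. */
#define BPS_STATUS_TRIP        (1u << 0)
#define BPS_STATUS_ALL_CLEAR   (1u << 1)
#define BPS_STATUS_FAULT_STATE (1u << 2)

/* Pack the three former single-purpose messages into one payload byte. */
uint8_t bps_status_pack(bool trip, bool all_clear, bool fault_state)
{
    uint8_t status = 0;
    if (trip)        status |= BPS_STATUS_TRIP;
    if (all_clear)   status |= BPS_STATUS_ALL_CLEAR;
    if (fault_state) status |= BPS_STATUS_FAULT_STATE;
    return status;
}

/* Test one flag in a received status byte. */
bool bps_status_has(uint8_t status, uint8_t flag)
{
    return (status & flag) != 0;
}

/* Mask filter: accept an ID when it matches filter_id on every bit set in mask. */
bool can_mask_filter_accepts(uint16_t rx_id, uint16_t filter_id, uint16_t mask)
{
    return (rx_id & mask) == (filter_id & mask);
}
```

As an example with assumed values, `filter_id = 0x110` with `mask = 0x7F0` lets a single filter slot accept every ID from 0x110 to 0x11F and reject everything else.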

Cool new things

  • Online DBC editor: https://www.csselectronics.com/pages/dbc-editor-can-bus-database

  • CAN muxing

    • We can have one ID and use some of the data bits to choose what the rest of the message means.

    • If I have 8 VoltTemp PCBs that are doing the exact same thing, instead of making 8 distinct CAN IDs I could just make 3 of the bits say which VoltTemp I'm receiving from

  • CAN Watchdogs

    • Now that we have a very strong reliance on CAN messages for car operation, we need to ensure CAN IDs are being properly sent and received.

    • Could make a driver that lets you create a timer that waits for a CAN ID to be received; if that CAN message is not received in time, it calls some error handler.
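
A minimal sketch of the muxing and watchdog ideas above: pulling a 3-bit mux index out of the first data byte, and a software watchdog that flags a CAN ID that has gone quiet. All names here are hypothetical, and so are the choice of the low 3 bits of byte 0 and the millisecond tick passed in by the caller. This is not an existing driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: low 3 bits of data byte 0 say which of 8 VoltTemp boards sent the frame. */
#define VOLTTEMP_MUX_MASK 0x07u

uint8_t volttemp_mux_index(const uint8_t *data)
{
    return (uint8_t)(data[0] & VOLTTEMP_MUX_MASK);
}

typedef void (*can_wdog_handler_t)(uint32_t can_id);

typedef struct {
    uint32_t can_id;               /* ID that must keep arriving */
    uint32_t timeout_ms;           /* longest allowed gap between frames */
    uint32_t last_rx_ms;           /* tick of the most recent matching frame */
    can_wdog_handler_t on_timeout; /* optional error handler, may be NULL */
} can_wdog_t;

void can_wdog_init(can_wdog_t *w, uint32_t can_id, uint32_t timeout_ms,
                   uint32_t now_ms, can_wdog_handler_t on_timeout)
{
    w->can_id = can_id;
    w->timeout_ms = timeout_ms;
    w->last_rx_ms = now_ms;
    w->on_timeout = on_timeout;
}

/* Call from the CAN receive path with every received ID. */
void can_wdog_feed(can_wdog_t *w, uint32_t rx_id, uint32_t now_ms)
{
    if (rx_id == w->can_id) {
        w->last_rx_ms = now_ms;
    }
}

/* Call periodically; returns true (and runs the handler) if the ID has timed out. */
bool can_wdog_check(const can_wdog_t *w, uint32_t now_ms)
{
    /* Unsigned subtraction stays correct when the tick counter wraps. */
    if ((uint32_t)(now_ms - w->last_rx_ms) > w->timeout_ms) {
        if (w->on_timeout != NULL) {
            w->on_timeout(w->can_id);
        }
        return true;
    }
    return false;
}
```

On target, `can_wdog_check` could be driven from an RTOS software timer, with `now_ms` taken from the system tick.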

Debugging Improvements:

Problems:

  • Plugging in SWD headers is difficult, and nearly impossible when the top shell is on

  • UART printf is very intrusive, and affects internal timings and watchdogs

  • Current debug methods like GDB are very intrusive to processes

ESP32:

  • ESP32 WROOM modules are very easy to program and the hardware is also very simple

  • We could even program them with the Arduino IDE if we really wanted to

  • ESP-NOW is a very easy way to get peer-to-peer communication between ESP32 devices, and doesn’t require a Wi-Fi connection

    • ESP-NOW only works peer to peer though, so we need a receiver ESP32.

  • Ideally we can combine this with our UART bootloader to do OTA updates

  • Range is limited, so this will only be a debugging tool when you’re close to the car

Bootloaders:

  • Enclosures can have panel mounted USB connectors, so a USB bootloader will allow us to flash a board without opening enclosures up

  • CAN bootloader will allow us to program any arbitrary board in the car while the top shell is on

    • Since we have several layered CAN buses (CarCAN, BPS CAN, Controls CAN, Sensor CAN), the ideal CAN bootloader allows us to program any node on any CAN bus from the Dashboard, since that is the only accessible enclosure when the top shell is on

  • When the top shell is on we cannot access the USB of any enclosure other than the dashboard, so we need another method to flash boards

Software Considerations for Solar McQueen:

  • Queues are cringe if you only care about the most recent piece of data

  • Code functionality should be very easy to modify / add to

  • Use the STM32’s internal watchdog to ensure a random task hasn’t taken control of your processor.

    • Basically you need to acknowledge the watchdog every n seconds to say that you’re doing fine, and if the watchdog is not acknowledged then you know some RTOS task has taken too long or has blocked the processor

    • You always want some guarantee that something important is running, whether that be through a watchdog or some external device waiting for a CAN message

  • In your fault state don’t suspend the scheduler

    • Our BSP is heavily built around the RTOS, so if we want to send CAN messages in a fault state then we need to have a new special implementation of CAN with no RTOS
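
One way to tie the hardware watchdog to task health is to refresh it only when every monitored task has checked in since the last refresh. The following is a minimal sketch under stated assumptions: every name is hypothetical, and `wdog_hw_refresh` is a stand-in for the real refresh call (with the STM32 HAL that would be `HAL_IWDG_Refresh`), stubbed as a counter so the logic can run off-target. On target, the check-in update would also need to be made safe against preemption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical task bits: one bit per task that must prove it is alive. */
#define TASK_BIT_CAN_TX  (1u << 0)
#define TASK_BIT_SENSORS (1u << 1)
#define TASK_BITS_ALL    (TASK_BIT_CAN_TX | TASK_BIT_SENSORS)

uint32_t checked_in_bits = 0;  /* set by tasks, cleared by the supervisor */
uint32_t hw_refresh_count = 0; /* stub state: counts hardware refreshes */

/* Stand-in for the real hardware watchdog refresh. */
void wdog_hw_refresh(void)
{
    hw_refresh_count++;
}

/* Each monitored task calls this once per loop iteration. */
void wdog_task_checkin(uint32_t task_bit)
{
    checked_in_bits |= task_bit;
}

/* Supervisor: refresh only if every task has checked in.
 * Returning false means the hardware watchdog is left to run down. */
bool wdog_supervise(void)
{
    if ((checked_in_bits & TASK_BITS_ALL) != TASK_BITS_ALL) {
        return false;
    }
    checked_in_bits = 0;
    wdog_hw_refresh();
    return true;
}
```

If one task hangs, its bit never gets set, the supervisor stops refreshing, and the hardware watchdog resets the MCU once its period expires.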

Sharepoint Changes:

  • We should rely more on CubeMX generated code for initializations and other things that don’t really matter in terms of performance

    • Mainly, the SystemClock config should be handled by each project and its end user, since the config changes based on external vs internal oscillator, which MCU it’s on, and some other factors. It’s not worth our time to make a generic solution; we should just have end users generate their own clock config for their microcontroller

  • Remove hard coded pin definitions in sharepoint

    • Can be hard to trace pin definitions from random files if it creates conflicts

    • If any pins need to be hard coded, wrap them in #ifndef so the user can override it

  • MspInit:

    • The MSPInit function for a peripheral is called when you call the init function for said peripheral

    • If I call HAL_CAN_Init() with the CAN1 handle, a weakly defined function called HAL_CAN_MspInit() will be called with that same handle.

    • For most peripherals, you can have a ton of different pin settings / configurations, so sharepoint should provide its own weak definition, and the user can then provide their own definition for their pin configuration

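    • A sketch of the override:

        // Sharepoint: weak default
        __weak void HAL_CAN_MspInit(CAN_HandleTypeDef *hcan) {
            // Init some default set of pins
        }

        // End user code: overrides the weak default
        void HAL_CAN_MspInit(CAN_HandleTypeDef *hcan) {
            // Init the pins I want
        }

        // In the user's init code (hcan1 is the user's CAN1 handle);
        // this ends up calling the user's HAL_CAN_MspInit
        HAL_CAN_Init(&hcan1);

  • Add a folder of common applications you can choose to include in your code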

    • Some driver that takes control of an interrupt (like a PWM driver that steals the timer interrupt callback)

    • A nonblocking printf task

    • A task that waits for UART or CAN bootloader messages

  • Good documentation on how to set it up

    • Maybe Doxygen and heavier requirements on BSP instructions?
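
The `#ifndef` override for hard-coded pins mentioned above could look like the following. The macro names and pin numbers are hypothetical. The first `#define` stands in for a project-level setting (a project header or a `-D` compiler flag) that is seen before the Sharepoint defaults.

```c
/* Project side: defined before the Sharepoint header is included. */
#define BSP_CAN1_TX_PIN 9

/* Sharepoint side: each default applies only if the project did not define its own. */
#ifndef BSP_CAN1_TX_PIN
#define BSP_CAN1_TX_PIN 12
#endif

#ifndef BSP_CAN1_RX_PIN
#define BSP_CAN1_RX_PIN 11
#endif

/* Helpers so the resolved values can be inspected at run time. */
int bsp_can1_tx_pin(void) { return BSP_CAN1_TX_PIN; }
int bsp_can1_rx_pin(void) { return BSP_CAN1_RX_PIN; }
```

Here the TX pin resolves to the project's value and the RX pin falls back to the Sharepoint default, so every pin choice can be traced to either the project config or one header.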