STM32 and RTFM Daniel Silverstone

I have been working with STM32 chips on-and-off for at least eight, possibly closer to nine years. About as long as ST have been touting them around. I love the STM32, and have done much with them in C. But, as my previous two posts may have hinted, I would like to start working with Rust instead of C. To that end, I have been looking with great joy at the work which Jorge Aparicio has been doing around Cortex-M3 and Rust. I've had many comments in person at Debconf, and also several people mention on Twitter, that they're glad more people are looking at this. But before I can get too much deeper into trying to write my USB stack, I need to sort a few things from what Jorge has done as demonstration work.

Okay, this is fast, but we need Ludicrous speed

All of Jorge's examples seem to leave the system clocks in a fairly default state, excepting turning on the clocks to the peripherals needed during the initialisation phase. Sadly, if we're going to be running the USB at all, we need the clocks to run a tad faster. Since my goal is to run something moderately CPU intensive on the end of the USB too, it makes sense to try and get our STM32 running at maximum clock speed. For the one I have, that's 72MHz rather than the 8MHz it starts out with. Nine times more cycles to do computing in makes a lot of sense.

As I said above, I've been doing STM32 in C a lot for many years; and fortunately I have built systems with the exact chip that's on the blue-pill before. As such, if I rummage, I can find some old C code which does what we need...

    /* Enable HSE */
    RCC_HSEConfig(RCC_HSE_ON);

    /* Wait till HSE is ready */
    HSEStartUpStatus = RCC_WaitForHSEStartUp();

    if (HSEStartUpStatus == SUCCESS)
    {
      /* Enable Prefetch Buffer */
      FLASH_PrefetchBufferCmd(FLASH_PrefetchBuffer_Enable);
      /* Flash 2 wait state */
      FLASH_SetLatency(FLASH_Latency_2);

      /* HCLK = SYSCLK */
      RCC_HCLKConfig(RCC_SYSCLK_Div1);
      /* PCLK2 = HCLK */
      RCC_PCLK2Config(RCC_HCLK_Div1);
      /* PCLK1 = HCLK/2 */
      RCC_PCLK1Config(RCC_HCLK_Div2);
      /* ADCCLK = PCLK2/6 */
      RCC_ADCCLKConfig(RCC_PCLK2_Div6);
      /* PLLCLK = 8MHz * 9 = 72 MHz */
      RCC_PLLConfig(RCC_PLLSource_HSE_Div1, RCC_PLLMul_9);

      /* Enable PLL */
      RCC_PLLCmd(ENABLE);
      /* Wait till PLL is ready */
      while (RCC_GetFlagStatus(RCC_FLAG_PLLRDY) == RESET)
      {}

      /* Select PLL as system clock source */
      RCC_SYSCLKConfig(RCC_SYSCLKSource_PLLCLK);
      /* Wait till PLL is used as system clock source */
      while (RCC_GetSYSCLKSource() != 0x08)
      {}
    }

This code, rather conveniently, uses an 8MHz external crystal so we can almost direct-port it to the blue-pill Rust code and see how we go. If you're used to the CMSIS libraries for STM32, then you won't completely recognise the above since it uses the pre-CMSIS core libraries to do its thing. Library code from 2008 and it's still good on today's STM32s providing they're in the right family :-)

A direct conversion to Rust, using Jorge's beautifully easy to work with crates made from svd2rust results in:

    fn make_go_faster(rcc: &RCC, flash: &FLASH) {
        rcc.cr.modify(|_, w| w.hseon().enabled());
        while !rcc.cr.read().hserdy().is_ready() {}
        flash.acr.modify(|_, w| w.prftbe().enabled());
        flash.acr.modify(|_, w| w.latency().two());
        rcc.cfgr.modify(|_, w| w
                        .hpre().div1()
                        .ppre2().div1()
                        .ppre1().div2()
                        // .adcpre().bits(8)
                        .pllsrc().external()
                        .pllxtpre().div1()
                        .pllmul().mul9()
        );
        rcc.cr.modify(|_, w| w.pllon().enabled());
        while rcc.cr.read().pllrdy().is_unlocked() {}
        rcc.cfgr.modify(|_,w| w.sw().pll());
        while !rcc.cfgr.read().sws().is_pll() {}
    }

Now, I've not put the comments in which were in the C code, because I'm being very lazy right now, but if you follow the two together you should be able to work it through. I don't have timeouts for the waits, and you'll notice a single comment there (I cannot set up the ADC prescaler because for some reason the SVD is missing any useful information and so the generated crate only carries an unsafe function (bits()) and I'm trying to steer clear of unsafe for now. Still, I don't need the ADC immediately, so I'm okay with this.

By using this function in the beginning of the init() function of the blinky example, I can easily demonstrate the clock is going faster since the LED blinks more quickly.

This function demonstrates just how simple it is to take bit-manipulation from the C code and turn it into (admittedly bad looking) Rust with relative ease and without any of the actual bit-twiddling. I love it.

Mess with time, and you get unexpected consequences

Sadly, when you mess with the clock tree on a microcontroller, you throw a lot of things out of whack. Not least, by adjusting the clock frequency up we end up adjusting the AHB, APB1, and APB2 clock frequencies. This has direct consequences for peripherals floating around on those busses. Fortunately Jorge thought of this and while the blue-pill crate hard-wires those frequencies to 8MHz, they are, at least, configurable in code in some sense.

If we apply the make_go_faster() function to the serial loopback example, it simply fails to work because now the bus which the USART1 peripheral is connected to (APB2) is going at a different speed from the expected power-on default of 8MHz. If you remember from the function, we did .hpre().div1() which set HCLK to 72MHz, then .ppre1().div2() which sets the APB1 bus clock to be HCLK divided by 2, and .ppre2().div1() which sets APB2 bus clock to be HCLK. This means that we'd need to alter src/lib.rs to reflect these changes in the clock frequences and in theory loopback would start working once more.

It'd be awkward to try and demonstrate all that to you since I only have a phone camera to hand, but if you own a blue-pill then you can clone Jorge's repo and have a go yourself and see that I'm not bluffing you.

With all this done, it'll be time to see if we can bring the USB peripheral in the STM32 online, and that will be the topic of my next post in this discovery series.

USB Device Stacks, on RTFM, part 2 Daniel Silverstone

Previously we talked about all the different kinds of descriptors which USB devices use to communicate their capability. This is important stuff because to write any useful USB device firmware we need to be able to determine how to populate our descriptors. However, having that data on the device is entirely worthless without an understanding of how it gets from the device to the host so that it can be acted upon. To understand that, let's look at the USB wire protocol.

Note, I'll again be talking mostly about USB2.0 low- and full-speed. I believe that high speed is approximately the same but with faster wires, except not quite that simple.

Down to the wire

I don't intend to talk about the actual electrical signalling, though it's not un-reasonable for you to know that USB is a pair of wires forming a differentially signalled bidirectional serial communications link. The host is responsible for managing all the framing and timing on the link, and for formatting the communications into packets.

There are a number of packet types which can appear on the USB link:

Packet type Purpose
Token Packet When the host wishes to send a message to the Control endpoint to configure the device, read data IN, or write data OUT, it uses this to start the transaction.
Data(0/1) Packet Following a Setup, In, or Out token, a Data packet is a transfer of data (in either direction). The 0 and 1 alternate to provide a measure of confidence against lost packets.
Handshake Packet Following a data packet of some kind, the other end may ACK the packet (all was well), NAK the packet (report that the device cannot, temporarily, send/receive data, or that an interrupt endpoint isn't triggered), or STALL the bus in which case the host needs to intervene.
Start of Frame Every 1ms (full-speed) the host will send a SOF packet which carries a frame number. This can be used to help keep time on very simple devices. It also divides the bus into frames within which bandwidth is allocated.

As an example, when the host wishes to perform a control transfer, the following packets are transacted in turn:

  1. Setup Token - The host addresses the device and endpoint (OUT0)
  2. Data0 Packet - The host transmits a GET_DESCRIPTOR for the device descriptor
  3. Ack Packet - The device acknowledges receipt of the request

This marks the end of the first transaction. The device decodes the GET_DESCRIPTOR request and prepares the device descriptor for transmission. The transmission occurs as the next transaction on the bus. In this example, we're assuming 8 byte maximum transmission sizes, for illustrative purposes.

  1. In Token - The host addresses the device and endpoint (IN0)
  2. Data1 Packet - The device transmits the first 8 bytes of the descriptor
  3. Ack Packet - The host acknowledges the data packet
  4. In Token - The host addresses the device and endpoint (IN0)
  5. Data0 Packet - The device transmits the remaining 4 bytes of the descriptor (padded)
  6. Ack Packet - The host acknowledges the data packet

The second transaction is now complete, and the host has all the data it needs to proceed. Finally a status transaction occurs in which:

  1. Out Token - The host addresses the device and endpoint (OUT0)
  2. Data1 Packet - The host transmits a 0 byte data packet to indicate successful completion
  3. Ack Packet - The device acknowledges the completion, indicating its own satisfaction

And thus ends the full control transaction in which the host retrieves the device descriptor.

From a high level, we need only consider the activity which occurs at the point of the acknowledgement packets. In the above example:

  1. On the first ACK the device prepares IN0 to transmit the descriptor, readying whatever low level device stack there is with a pointer to the descriptor and its length in bytes.
  2. On the second ACK the low levels are still thinking.
  3. On the third ACK the transmission from IN0 is complete and the endpoint no longer expects to transfer data.
  4. On the fourth ACK the control transaction is entirely complete.

Thinking at the low levels of the control interface

Before we can build a high level USB stack, we need to consider the activity which might occur at the lower levels. At the low levels, particularly of the device control interface, work has to be done at each and every packet. The hardware likely deals with the token packet for us, leaving the data packets for us to process, and the resultant handshake packets will be likely handled by the hardware in response to our processing the data packets.

Since every control transaction is initiated by a setup token, let's look at the setup requests which can come our way...

Setup Packet (Data) Format
Field Name Byte start Byte length Encoding Meaning
bmRequestType 0 1 Bitmap Describes the kind of request, and the target of it. See below.
bRequest 1 1 Code The request code itself, meanings of the rest of the fields vary by bRequest
wValue 2 2 Number A 16 bit value whose meaning varies by request type
wIndex 4 2 Number A 16 bit value whose meaning varies by request type but typically encodes an interface number or endpoint.
wLength 6 2 Number A 16 bit value indicating the length of the transfer to come.

Since bRequest is essentially a switch against which multiple kinds of setup packet are selected between, here's the meanings of a few...

GET_DESCRIPTOR (Device) setup packet
Field Name Value Meaning
bmRequestType 0x08 Data direction is IN (from device to host), recipient is the device
bRequest 0x06 GET_DESCRIPTOR (in this instance, the device descriptor is requested)
wValue 0x0001 This means the device descriptor
wIndex 0x0000 Irrelevant, there's only 1 device descriptor anyway
wLength 12 This is the length of a device descriptor (12 bytes)
SET_ADDRESS to set a device's USB address
Field Name Value Meaning
bmRequestType 0x00 Data direction is OUT (from host to device), recipient is the device
bRequest 0x05 SET_ADDRESS (Set the device's USB address)
wValue 0x00nn The address for the device to adopt (max 127)
wIndex 0x0000 Irrelevant for address setting
wLength 0 There's no data transfer expected for this setup operation

Most hardware blocks will implement an interrupt at the point that the Data packet following the Setup packet has been receive. This is typically called receiving a 'Setup' packet and then it's up to the device stack low levels to determine what to do and dispatch a handler. Otherwise an interrupt will fire for the IN or OUT tokens and if the endpoint is zero, the low level stack will handle it once more.

One final thing worth noting about SET_ADDRESS is that it doesn't take effect until the completion of the zero-length "status" transaction following the setup transaction. As such, the status request from the host will still be sent to address zero (the default for new devices).

A very basic early "packet trace"

This is an example, and is not guaranteed to be the packet sequence in all cases. It's a good indication of the relative complexity involved in getting a fresh USB device onto the bus though...

When a device first attaches to the bus, the bus is in RESET state and so the first event a device sees is a RESET which causes it to set its address to zero, clear any endpoints, clear the configuration, and become ready for control transfers. Shortly after this, the device will become suspended.

Next, the host kicks in and sends a port reset of around 30ms. After this, the host is ready to interrogate the device.

The host sends a GET_DESCRIPTOR to the device, whose address at this point is zero. Using the information it receives from this, it can set up the host-side memory buffers since the device descriptor contains the maximum transfer size which the device supports.

The host is now ready to actually 'address' the device, and so it sends another reset to the device, again around 30ms in length.

The host sends a SET_ADDRESS control request to the device, telling it that its new address is nn. Once the acknowledgement has been sent from the host for the zero-data status update from the device, the device sets its internal address to the value supplied in the request. From now on, the device shall respond only to requests to nn rather than to zero.

At this point, the host will begin interrogating further descriptors, looking at the configuration descriptors and the strings, to build its host-side representation of the device. These will be GET_DESCRIPTOR and GET_STRING_DESCRIPTOR requests and may continue for some time.

Once the host has satisfied itself that it knows everything it needs to about the device, it will issue a SET_CONFIGURATION request which basically starts everything up in the device. Once the configuration is set, interrupt endpoints will be polled, bulk traffic will be transferred, Isochronous streams begin to run, etc.

Okay, but how do we make this concrete?

So far, everything we've spoken about has been fairly abstract, or at least "soft". But to transfer data over USB does require some hardware. (Okay, okay, we could do it all virtualised, but there's no fun in that). The hardware I'm going to be using for the duration of this series is the STM32 on the blue-pill development board. This is a very simple development board which does (in theory at least) support USB device mode.

If we view the schematic for the blue-pill, we can see a very "lightweight" USB interface which has a pullup resistor for D+. This is the way that a device signals to the host that it is present, and that it wants to speak at full-speed. If the pullup were on D- then it would be a low-speed device. High speed devices need a little more complexity which I'm not going to go into for today.

The USB lines connect to pins PA11 and PA12 which are the USB pins on the STM32 on the board. Since USB is quite finicky, the STM32 doesn't let you remap that function elsewhere, so this is all looking quite good for us so far.

The specific STM32 on the blue-pill is the STM32F103C8T6. By viewing its product page on ST's website we can find the reference manual for the part. Jumping to section 23 we learn that this STM32 supports full-speed USB2.0 which is convenient given the past article and a half. We also learn it supports up to eight endpoints active at any one time, and offers double-buffering for our bulk and isochronous transfers. It has some internal memory for packet buffering, so it won't use our RAM bandwidth while performing transfers, which is lovely.

I'm not going to distill the rest of that section here, because there's a large amount of data which explains how the USB macrocell operates. However useful things to note are:

  • How IN OUT and SETUP transfers work.
  • How the endpoint buffer memory is configured.
  • That all bus-powered devices MUST respond to suspend/resume properly
  • That the hardware will prioritise endpoint interrupts for us so that we only need deal with the most pressing item at any given time.
  • There is an 'Enable Function' bit in the address register which must be set or we won't see any transactions at all.
  • How the endpoint registers signal events to the device firmware.

Next time, we're going to begin the process of writing a very hacky setup routine to try and initialise the USB device macrocell so that we can see incoming transactions through the ITM. It should be quite exciting, but given how complex this will be for me to learn, it might be a little while before it comes through.

USB Device Stacks, on RTFM Daniel Silverstone

I have been spending time with Jorge Aparicio's RTFM for Cortex M3 framework for writing Rust to target Cortex-M3 devices from Arm (and particularly the STM32F103 from ST Microelectronics). Jorge's work in this area has been of interest to me ever since I discovered him working on this stuff a while ago. I am very tempted by the idea of being able to implement code for the STM32 with the guarantees of Rust and the language features which I have come to love such as the trait system.

I have been thinking to myself that, while I admire and appreciate the work done on the GNUK, I would like to, personally, have a go at implementing some kind of security token on an STM32 as a USB device. And with the advent of the RTFM for M3 work, and Jorge's magical tooling to make it easier to access and control the registers on an M3 microcontroller, I figured it'd be super-nice to do this in Rust, with all the advantages that entails in terms of isolating unsafe behaviour and generally having the potential to be more easily verified as not misbehaving.

To do this though, means that I need a USB device stack which will work in the RTFM framework. Sadly it seems that, thus-far, only Jorge has been working on drivers for any of the M3 devices his framework supports. And one person can only do so much. So, in my infinite madness, I decided I should investigate the complexity of writing a USB device stack in Rust for the RTFM/M3 framework. (Why I thought this was a good idea is lost to the mists of late night Googling, but hey, it might make a good talk at the next conference I go to). As such, this blog post, and further ones along these lines, will serve as a partial tour of what I'm up to, and a partial aide-memoir for me about learning USB. If I get something horribly wrong, please DO contact me to correct me, otherwise I'll just continue to be wrong. If I've simplified something but it's still strictly correct, just let me know if it's an oversimplification since in a lot of cases there's no point in me putting the full details into a blog posting. I will mostly be considering USB2.0 protocol details but only really for low and full speed devices. (The hardware I'm targetting does low-speed and full-speed, but not high-speed. Though some similar HW does high-speed too, I don't have any to hand right now)

A brief introduction to USB

In order to go much further, I needed a grounding in USB. It's a multi-layer protocol as you might expect, though we can probably ignore the actual electrical layer since any device we might hope to support will have to have a hardware block to deal with that. We will however need to consider the packet layer (since that will inform how the hardware block is implemented and thus its interface) and then the higher level protocols on top.

USB is a deliberately asymmetric protocol. Devices are meant to be significantly easier to implement, both in terms of hardware and software, as compared with hosts. As such, despite some STM32s having OTG ports, I have no intention of supporting host mode at this time.

USB is arranged into a set of busses which are, at least in the USB1.1 case, broadcast domains. As such, each device has an address assigned to it by the host during an early phase called 'configuration'. Once the address is assigned, the device is expected to only ever respond to messages addressed to it. Note that since everything is asymmetric in USB, the device can't send messages on its own, but has to be asked for them by the host, and as such the addressing is always from host toward device.

USB devices then expose a number of endpoints through which communication can flow IN to the host or OUT to the device. Endpoints are not bidirectional, but the in and out endpoints do overlap in numbering. There is a special pair of endpoints, IN0 and OUT0 which, between them, form what I will call the device control endpoints. The device control endpoints are important since every USB device MUST implement them, and there are a number of well defined messages which pass over them to control the USB device. In theory a bare minimum USB device would implement only the device control endpoints.

Configurations, and Classes, and Interfaces, Oh My!

In order for the host to understand what the USB device is, and what it is capable of, part of the device control endpoints' responsibility is to provide a set of descriptors which describe the device. These descriptors form a heirarchy and are then glommed together into a big lump of data which the host can download from the device in order to decide what it is and how to use it. Because of various historical reasons, where a multi-byte value is used, they are defined to be little-endian, though there are some BCD fields. Descriptors always start with a length byte and a type byte because that way the host can parse/skip as necessary, with ease.

The first descriptor is the device descriptor, is a big one, and looks like this:

Device Descriptor
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (18)
bDescriptorType 1 1 Constant Device Descriptor (0x01)
bcdUSB 2 2 BCD USB spec version compiled with
bDeviceClass 4 1 Class Code, assigned by USB org (0 means "Look at interface descriptors", common value is 2 for CDC)
bDeviceSubClass 5 1 SubClass Code, assigned by USB org (usually 0)
bDeviceProtocol 6 1 Protocol Code, assigned by USB org (usually 0)
bMaxPacketSize 7 1 Number Max packet size for IN0/OUT0 (Valid are 8, 16, 32, 64)
idVendor 8 2 ID 16bit Vendor ID (Assigned by USB org)
idProduct 10 2 ID 16bit Product ID (Assigned by manufacturer)
bcdDevice 12 2 BCD Device version number (same encoding as bcdUSB)
iManufacturer 14 1 Index String index of manufacturer name (0 if unavailable)
iProduct 15 1 Index String index of product name (0 if unavailable)
iSerialNumber 16 1 Index String index of device serial number (0 if unavailable)
bNumConfigurations 17 1 Number Count of configurations the device has.

This looks quite complex, but breaks down into a relatively simple two halves. The first eight bytes carries everything necessary for the host to be able to configure itself and the device control endpoints properly in order to communicate effectively. Since eight bytes is the bare minimum a device must be able to transmit in one go, the host can guarantee to get those, and they tell it what kind of device it is, what USB protocol it supports, and what the maximum transfer size is for its device control endpoints.

The encoding of the bcdUSB and bcdDevice fields is interesting too. It is of the form 0xMMmm where MM is the major number, mm the minor. So USB2.0 is encoded as 0x0200, USB1.1 as 0x0110 etc. If the device version is 17.36 then that'd be 0x1736.

Other fields of note are bDeviceClass which can be 0 meaning that interfaces will specify their classes, and idVendor/idProduct which between them form the primary way for the specific USB device to be identified. The Index fields are indices into a string table which we'll look at later. For now it's enough to know that wherever a string index is needed, 0 can be provided to mean "no string here".

The last field is bNumConfigurations and this indicates the number of ways in which this device might function. A USB device can provide any number of these configurations, though typically only one is provided. If the host wishes to switch between configurations then it will have to effectively entirely quiesce and reset the device.

The next kind of descriptor is the configuration descriptor. This one is much shorter, but starts with the same two fields:

Configuration Descriptor
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (9)
bDescriptorType 1 1 Constant Configuration Descriptor (0x02)
wTotalLength 2 2 Number Size of the configuration in bytes, in total
bNumInterfaces 4 1 Number The number of interfaces in this configuration
bConfigurationValue 5 1 Number The value to use to select this configuration
iConfiguration 6 1 Index The name of this configuration (0 for unavailable)
bmAttributes 7 1 Bitmap Attributes field (see below)
bMaxPower 8 1 Number Maximum bus power this configuration will draw (in 2mA increments)

An important field to consider here is the bmAttributes field which tells the host some useful information. Bit 7 must be set, bit 6 is set if the device would be self-powered in this configuration, bit 5 indicates that the device would like to be able to wake the host from sleep mode, and bits 4 to 0 must be unset.

The bMaxPower field is interesting because it encodes the power draw of the device (when set to this configuration). USB allows for up to 100mA of draw per device when it isn't yet configured, and up to 500mA when configured. The value may be used to decide if it's sensible to configure a device if the host is in a low power situation. Typically this field will be set to 50 to indicate the nominal 100mA is fine, or 250 to request the full 500mA.

Finally, the wTotalLength field is interesting because it tells the host the total length of this configuration, including all the interface and endpoint descriptors which make it up. With this field, the host can allocate enough RAM to fetch the entire configuration descriptor block at once, simplifying matters dramatically for it.

Each configuration has one or more interfaces. The interfaces group some endpoints together into a logical function. For example a configuration for a multifunction scanner/fax/printer might have an interface for the scanner function, one for the fax, and one for the printer. Endpoints are not shared among interfaces, so when building this table, be careful.

Next, logically, come the interface descriptors:

Interface Descriptor
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (9)
bDescriptorType 1 1 Constant Interface Descriptor (0x04)
bInterfaceNumber 2 1 Number The number of the interface
bAlternateSetting 3 1 Number The interface alternate index
bNumEndpoints 4 1 Number The number of endpoints in this interface
bInterfaceClass 5 1 Class The interface class (USB Org defined)
bInterfaceSubClass 6 1 SubClass The interface subclass (USB Org defined)
bInterfaceProtocol 7 1 Protocol The interface protocol (USB Org defined)
iInterface 8 1 Index The name of the interface (or 0 if not provided)

The important values here are the class/subclass/protocol fields which provide a lot of information to the host about what the interface is. If the class is a USB Org defined one (e.g. 0x02 for Communications Device Class) then the host may already have drivers designed to work with the interface meaning that the device manufacturer doesn't have to provide host drivers.

The bInterfaceNumber is used by the host to indicate this interface when sending messages, and the bAlternateSetting is a way to vary interfaces. Two interfaces with the came bInterfaceNumber but different bAlternateSettings can be switched between (like configurations, but) without resetting the device.

Hopefully the rest of this descriptor is self-evident by now.

The next descriptor kind is endpoint descriptors:

Endpoint Descriptor
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (7)
bDescriptorType 1 1 Constant Endpoint Descriptor (0x05)
bEndpointAddress 2 1 Endpoint Endpoint address (see below)
bmAttributes 3 1 Bitmap Endpoint attributes (see below)
wMaxPacketSize 4 2 Number Maximum packet size this endpoint can send/receive
bInterval 6 1 Number Interval for polling endpoint (in frames)

The bEndpointAddress is a 4 bit endpoint number (so there're 16 endpoint indices) and a bit to indicate IN vs. OUT. Bit 7 is the direction marker and bits 3 to 0 are the endpoint number. This means there are 32 endpoints in total, 16 in each direction, 2 of which are reserved (IN0 and OUT0) giving 30 endpoints available for interfaces to use in any given configuration. The bmAttributes bitmap covers the transfer type of the endpoint (more below), and the bInterval is an interval measured in frames (1ms for low or full speed, 125µs in high speed). bInterval is only valid for some endpoint types.

The final descriptor kind is for the strings which we've seen indices for throughout the above. String descriptors have two forms:

String Descriptor (index zero)
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (variable)
bDescriptorType 1 1 Constant String Descriptor (0x03)
wLangID[0] 2 2 Number Language code zero (e.g. 0x0409 for en_US)
wLangID[n] 4.. 2 Number Language code n ...

This form (for descriptor 0) is that of a series of language IDs supported by the device. The device may support any number of languages. When the host requests a string descriptor, it will supply both the index of the string and also the language id it desires (from the list available in string descriptor zero). The host can tell how many language IDs are available simply by dividing bLength by 2 and subtracting 1 for the two header bytes.

And for string descriptors of an index greater than zero:

String Descriptor (index greater than zero)
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (variable)
bDescriptorType 1 1 Constant String Descriptor (0x03)
bString 2.. .. Unicode The string, in "unicode" format

This second form of the string descriptor is simply the the string is in what the USB spec calls 'Unicode' format which is, as of 2005, defined to be UTF16-LE without a BOM or terminator.

Since string descriptors are of a variable length, the host must request strings in two transactions. First a request for 2 bytes is sent, retrieving the bLength and bDescriptorType fields which can be checked and memory allocated. Then a request for bLength bytes can be sent to retrieve the entire string descriptor.

Putting that all together

Phew, this is getting to be quite a long posting, so I'm going to leave this here and in my next post I'll talk about how the host and device pass packets to get all that information to the host, and how it gets used.

Gitano 1.1 Daniel Silverstone

Today marks the release of Gitano 1.1. Richard(s) and I have spent quite a lot of time and effort on this release, and there's plenty of good stuff in it. We also released new versions of Lace, Supple, Luxio, and Gall to go alongside it, with bugfixes and improvements.

At this point, I intend to take a short break from Gitano to investigate some Rust-on-STM32 stuff, and then perhaps do some NetSurf work too.

F/LOSS activity, July 2017 Daniel Silverstone

Once again, my focus was on Gitano, which we're working toward a 1.1 for. We had another one of our Gitano developer days which was attended by Richard maw and myself. You are invited to read the wiki page but a summary of what happened, which directly involved me, is:

  • Once again, we reviewed our current task state
  • We had a good discussion about our code of conduct including adopting a small change from upstream to improve matters
  • I worked on, and submitted a patch for, improving nested error message reports in Lace.
  • I reviewed and merged some work from Richard about pattern centralisation
  • I responded to comments on a number of in-flight series Richard had reviewed for me.
  • We discussed our plans for 1.1 and agreed that we'll be skipping a developer day in August because so much of it is consumed by DebConf and so on.

Other than that, related to Gitano during July I:

  • Submitted some code series before the developer day covering Gall cleanups and hook support in Gitano.
  • Reviewed and merged some more Makefile updates from Richard Ipsum
  • Reviewed and merged a Supple fix for environment cleardown from Richard Ipsum
  • Fixed an issue in one of the Makefiles which made it harder to build with dh-lua
  • I began work in earnest on Gitano CI, preparing a lot of scripts and support to sit around Jenkins (for now) for CIing packaging etc for Gitano and Debian
  • I began work on a system branch concept for Gitano CI which will let us handle the CI of branches in the repos, even if they cross repos.

I don't think I've done much non-Gitano F/LOSS work in July, but I am now in Montréal for debconf 2017 so hopefully more to say next month.

Yay, finished my degree at last Daniel Silverstone

A little while back, in June, I sat my last exam for what I hoped would be the last module in my degree. For seven years, I've been working on a degree with the Open University and have been taking advantage of the opportunity to have a somewhat self-directed course load by taking the 'Open' degree track. When asked why I bothered to do this, I guess my answer has been a little varied. In principle it's because I felt like I'd already done a year's worth of degree and didn't want it wasted, but it's also because I have been, in the dim and distant past, overlooked for jobs simply because I had no degree and thus was an easy "bin the CV".

Fed up with this, I decided to commit to the Open University and thus began my journey toward 'qualification' in 2010. I started by transferring the level 1 credits from my stint at UCL back in 1998/1999 which were in a combination of basic programming in Java, some mathematics including things like RSA, and some psychology and AI courses which at the time were aiming at a degree called 'Computer Science with Cognitive Sciences'.

Then I took level 2 courses, M263 (Building blocks of software), TA212 (The technology of music) and MS221 (Exploring mathematics). I really enjoyed the mathematics course and so...

At level 3 I took MT365 (Graphs, networks and design), M362 (Developing concurrent distributed systems), TM351 (Data management and analysis - which I ended up hating), and finally finishing this June with TM355 (Communications technology).

I received an email this evening telling me the module result for TM355 had been posted, and I logged in to find I had done well enough to be offered my degree. I could have claimed my degree 18+ months ago, but I persevered through another two courses in order to qualify for an honours degree which I have now been awarded. Since I don't particularly fancy any ceremonial awarding, I just went through the clicky clicky and accepted my qualification of 'Batchelor of Science (Honours) Open, Upper Second-class Honours (2.1)' which grants me the letters 'BSc (Hons) Open (Open)' which, knowing me, will likely never even make it onto my CV because I'm too lazy.

It has been a significant effort, over the course of the past few years, to complete a degree without giving up too much of my personal commitments. In addition to earning the degree, I have worked, for six of the seven years it has taken, for Codethink doing interesting work in and around Linux systems and Trustable software. I have designed and built Git server software which is in use in some universities, and many companies, along with a good few of my F/LOSS colleagues. And I've still managed to find time to attend plays, watch films, read an average of 2 novel-length stories a week (some of which were even real books), and be a member of the Manchester Hackspace.

Right now, I'm looking forward to a stress free couple of weeks, followed by an immense amount of fun at Debconf17 in Montréal!

F/LOSS activity, June 2017 Daniel Silverstone

It seems to be becoming popular to send a mail each month detailing your free software work for that month. I have been slowly ramping my F/LOSS activity back up, after years away where I worked on completing my degree. My final exam for that was in June 2017 and as such I am now in a position to try and get on with more F/LOSS work.

My focus, as you might expect, has been on Gitano which reached 1.0 in time for Stretch's release and which is now heading gently toward a 1.1 release which we have timed for Debconf 2017. My friend a colleague Richard has been working hard on Gitano and related components during this time too, and I hope that Debconf will be an opportunity for him to meet many of my Debian friends too. But enough of that, back to the F/LOSS.

We've been running Gitano developer days roughly monthly since March of 2017, and the June developer day was attended by myself, Richard Maw, and Richard Ipsum. You are invited to read the wiki page for the developer day if you want to know exactly what we got up to, but a summary of my involvement that day is:

  • I chaired the review of our current task state for the project
  • I chaired the decision on the 1.1 timeline.
  • I completed a code branch which adds rudimentary hook support to Gitano and submitted it for code review.
  • I began to learn about git-multimail since we have a need to add support for it to Gitano

Other than that, related to Gitano during June I:

  • Reviewed Richard Ipsum's lua-scrypt patches for salt generation
  • Reviewed Richard Maw's work on centralising Gitano's patterns into a module.
  • Responded to reviews of my hook work, though I need to clean it up some before it'll be merged.

My non-Gitano F/LOSS related work in June has been entirely centred around the support I provide to the Lua community in the form of the Lua mailing list and website. The host on which it's run is ailing, and I've had to spend time trying to improve and replace that.

Hopefully I'll have more to say next month. Perhaps by doing this reporting I'll get more F/LOSS done. Of course, July's report will be sent out while I'm in Montréal for debconf 2017 (or at least for debcamp at that point) so hopefully more to say anyway.

Yarn architecture discussion Daniel Silverstone

Recently Rob and I visited Soile and Lars. We had a lovely time wandering around Helsinki with them, and I also spent a good chunk of time with Lars working on some design and planning for the Yarn test specification and tooling. You see, I wrote a Rust implementation of Yarn called rsyarn "for fun" and in doing so I noted a bunch of missing bits in the understanding Lars and I shared about how Yarn should work. Lars and I filled, and re-filled, a whiteboard with discussion about what the 'Yarn specification' should be, about various language extensions and changes, and also about what functionality a normative implementation of Yarn should have.

This article is meant to be a write-up of all of that discussion, but before I start on that, I should probably summarise what Yarn is.


Yarn is a mechanism for specifying tests in a form which is more like documentation than code. Yarn follows the concept of BDD story based design/testing and has a very Cucumberish scenario language in which to write tests. Yarn takes, as input, Markdown documents which contain code blocks with Yarn tests in them; and it then runs those tests and reports on the scenario failures/successes.

As an example of a poorly written but still fairly effective Yarn suite, you could look at Gitano's tests or perhaps at Obnam's tests (rendered as HTML). Yarn is not trying to replace unit testing, nor other forms of testing, but rather seeks to be one of a suite of test tools used to help validate software and to verify integrations. Lars writes Yarns which test his server setups for example.

As an example, lets look at what a simple test might be for the behaviour of the /bin/true tool:

SCENARIO true should exit with code zero

WHEN /bin/true is run with no arguments
THEN the exit code is 0
 AND stdout is empty
 AND stderr is empty

Anyone ought to be able to understand exactly what that test is doing, even though there's no obvious code to run. Yarn statements are meant to be easily grokked by both developers and managers. This should be so that managers can understand the tests which verify that requirements are being met, without needing to grok python, shell, C, or whatever else is needed to implement the test where the Yarns meet the metal.

Obviously, there needs to be a way to join the dots, and Yarn calls those things IMPLEMENTS, for example:

IMPLEMENTS WHEN (\S+) is run with no arguments
set +e
"${MATCH_1}" > "${DATADIR}/stdout" 2> "${DATADIR}/stderr"
echo $? > "${DATADIR}/exitcode"

As you can see from the example, Yarn IMPLEMENTS can use regular expressions to capture parts of their invocation, allowing the test implementer to handle many different scenario statements with one implementation block. For the rest of the implementation, whatever you assume about things will probably be okay for now.


Given all of the above, we (Lars and I) decided that it would make a lot of sense if there was a set of Yarn scenarios which could validate a Yarn implementation. Such a document could also form the basis of a Yarn specification and also a manual for writing reasonable Yarn scenarios. As such, we wrote up a three-column approach to what we'd need in that test suite.

Firstly we considered what the core features of the Yarn language are:

  • Scenario statements themselves (SCENARIO, GIVEN, WHEN, THEN, ASSUMING, FINALLY, AND, IMPLEMENTS, EXAMPLE, ...)
  • Whitespace normalisation of statements
  • Regexp language and behaviour
  • IMPLEMENTS current directory, data directory, home directory, and also environment.
  • Error handling for the statements, or for missing IMPLEMENTS
  • File (and filename) encoding
  • Labelled code blocks (since commonmark includes the backtick code block kind)
  • Exactly one IMPLEMENTS per statement

We considered unusual (or corner) cases and which of them needed defining in the short to medium term:

  • Statements before any SCENARIO or IMPLEMENTS
  • Meaning of split code blocks (concatenation?)
  • Meaning of code blocks not at the top level of a file (ignore?)
  • Meaning of HTML style comments in markdown files
  • Odd scenario ordering (e.g. ASSUMING at the end, or FINALLY at the start)
  • Meaning of empty lines in code blocks or between them.

All of this comes down to how to interpret input to a Yarn implementation. In addition there were a number of things we felt any "normative" Yarn implementation would have to handle or provide in order to be considered useful. It's worth noting that we don't specify anything about an implementation being a command line tool though...

  • Interpreter for IMPLEMENTS (and arguments for them)
  • "Library" for those implementations
  • Ability to require that failed ASSUMING statements lead to an error
  • A way to 'stop on first failure'
  • A way to select a specific scenario to run, from a large suite.
  • Generation of timing reports (per scenario and also per statement)
  • A way to 'skip' missing IMPLEMENTS
  • A clear way to identify the failing step in a scenario.
  • Able to treat multiple input files as a single suite.

There's bound to be more, but right now with the above, we believe we have two roughly conformant Yarn implementations. Lars' Python based implementation which lives in cmdtest (and which I shall refer to as pyyarn for now) and my Rust based one (rsyarn).


One thing which rsyarn supports, but pyyarn does not, is running multiple scenarios in parallel. However when I wrote that support into rsyarn I noticed that there were plenty of issues with running stuff in parallel. (A problem I'm sure any of you who know about threads will appreciate).

One particular issue was that scenarios often need to share resources which are not easily sandboxed into the ${DATADIR} provided by Yarn. For example databases or access to limited online services. Lars and I had a good chat about that, and decided that a reasonable language extension could be:

USING database foo

with its counterpart

RESOURCE database (\S+)
LABEL database-$1
GIVEN a database called $1
FINALLY database $1 is torn down

The USING statement should be reasonably clear in its pairing to a RESOURCE statement. The LABEL statement I'll get to in a moment (though it's only relevant in a RESOURCE block, and the rest of the statements are essentially substituted into the calling scenario at the point of the USING.

This is nowhere near ready to consider adding to the specification though. Both Lars and I are uncomfortable with the $1 syntax though we can't think of anything nicer right now; and the USING/RESOURCE/LABEL vocabulary isn't set in stone either.

The idea of the LABEL is that we'd also require that a normative Yarn implementation be capable of specifying resource limits by name. E.g. if a RESOURCE used a LABEL foo then the caller of a Yarn scenario suite could specify that there were 5 foos available. The Yarn implementation would then schedule a maximum of 5 scenarios which are using that label to happen simultaneously. At bare minimum it'd gate new users, but at best it would intelligently schedule them.

In addition, since this introduces the concept of parallelism into Yarn proper, we also wanted to add a maximum parallelism setting to the Yarn implementation requirements; and to specify that any resource label which was not explicitly set had a usage limit of 1.


Once we'd discussed the parallelism, we decided that once we had a nice syntax for expanding these sets of statements anyway, we may as well have a syntax for specifying scenario language expansions which could be used to provide something akin to macros for Yarn scenarios. What we came up with as a starter-for-ten was:

CALLING write foo

paired with

EXPANDING write (\S+)
GIVEN bar
WHEN $1 is written to
THEN success was had by all

Again, the CALLING/EXPANDING keywords are not fixed yet, nor is the $1 type syntax, though whatever is used here should match the other places where we might want it.


Finally we discussed multi-line inputs in Yarn. We currently have a syntax akin to:

GIVEN foo
... bar
... baz

which is directly equivalent to:

GIVEN foo bar baz

and this is achieved by collapsing the multiple lines and using the whitespace normalisation functionality of Yarn to replace all whitespace sequences with single space characters. However this means that, for example, injecting chunks of YAML into a Yarn scenario is a pain, as would be including any amount of another whitespace-sensitive input language.

After a lot of to-ing and fro-ing, we decided that the right thing to do would be to redefine the ... Yarn statement to be whitespace preserving and to then pass that whitespace through to be matched by the IMPLEMENTS or whatever. In order for that to work, the regexp matching would have to be defined to treat the input as a single line, allowing . to match \n etc.

Of course, this would mean that the old functionality wouldn't be possible, so we considered allowing a \ at the end of a line to provide the current kind of behaviour, rewriting the above example as:

GIVEN foo \
bar \
baz

It's not as nice, but since we couldn't find any real uses of ... in any of our Yarn suites where having the whitespace preserved would be an issue, we decided it was worth the pain.


None of the above is, as of yet, set in stone. This blog posting is about me recording the information so that it can be referred to; and also to hopefully spark a little bit of discussion about Yarn. We'd welcome emails to our usual addresses, being poked on Twitter, or on IRC in the common spots we can be found. If you're honestly unsure of how to get hold of us, just comment on this blog post and I'll find your message eventually.

Hopefully soon we can start writing that Yarn suite which can be used to validate the behaviour of pyyarn and rsyarn and from there we can implement our new proposals for extending Yarn to be even more useful.

Gitano - Approaching Release - Deprecated commands Daniel Silverstone

As mentioned previously I am working toward getting Gitano into Stretch. Last time we spoke about lace, on which a colleague and friend of mine (Richard Maw) did a large pile of work. This time I'm going to discuss deprecation approaches and building more capability out of fewer features.

First, a little background -- Gitano is written in Lua which is a deliberately small language whose authors spend more time thinking about what they can remove from the language spec than they do what they could add in. I first came to Lua in the 3.2 days, a little before 4.0 came out. (The authors provide a lovely timeline in case you're interested.) With each of the releases of Lua which came after 3.2, I was struck with how the authors looked to take a number of features which the language had, and collapse them into more generic, more powerful, smaller, fewer features.

This approach to design stuck with me over the subsequent decade, and when I began Gitano I tried to have the smallest number of core features/behaviours, from which could grow the power and complexity I desired. Gitano is, at its core, a set of files in a single format (clod) stored in a consistent manner (Git) which mediate access to a resource (Git repositories). Some of those files result in emergent properties such as the concept of the 'owner' of a repository (though that can simply be considered the value of the project.owner property for the repository). Indeed the concept of the owner of a repository is a fiction generated by the ACL system with a very small amount of collusion from the core of Gitano. Yet until recently Gitano had a first class command set-owner which would alter that one configuration value.

[gitano]  set-description ---- Set the repo's short description (Takes a repo)
[gitano]         set-head ---- Set the repo's HEAD symbolic reference (Takes a repo)
[gitano]        set-owner ---- Sets the owner of a repository (Takes a repo)

Those of you with Gitano installations may see the above if you ask it for help. Yet you'll also likely see:

[gitano]           config ---- View and change configuration for a repository (Takes a repo)

The config command gives you access to the repository configuration file (which, yes, you could access over git instead, but the config command can be delegated in a more fine-grained fashion without having to write hooks). Given the config command has all the functionality of the three specific set-* commands shown above, it was time to remove the specific commands.

Migrating

If you had automation which used the set-description, set-head, or set-owner commands then you will want to switch to the config command before you migrate your server to the current or any future version of Gitano.

In brief, where you had:

ssh git@gitserver set-FOO repo something

You now need:

ssh git@gitserver config repo set project.FOO something

It looks a little more wordy but it is consistent with the other features that are keyed from the project configuration, such as:

ssh git@gitserver config repo set cgitrc.section Fooble Section Name

And, of course, you can see what configuration is present with:

ssh git@gitserver config repo show

Or look at a specific value with:

ssh git@gitserver config repo show specific.key

As always, you can get more detailed (if somewhat cryptic) help with:

ssh git@gitserver help config

Next time I'll try and touch on the new PGP/GPG integration support.

Gitano - Approaching Release - Access Control Changes Daniel Silverstone

As mentioned previously I am working toward getting Gitano into Stretch. A colleague and friend of mine (Richard Maw) did a large pile of work on Lace to support what we are calling sub-defines. These let us simplify Gitano's ACL files, particularly for individual projects.

In this posting, I'd like to cover what has changed with the access control support in Gitano, so if you've never used it then some of this may make little sense. Later on, I'll be looking at some better user documentation in conjunction with another friend of mine (Lars Wirzenius) who has promised to help produce a basic administration manual before Stretch is totally frozen.

Sub-defines

With a more modern lace (version 1.3 or later) there is a mechanism we are calling 'sub-defines'. Previously if you wanted to write a ruleset which said something like "Allow Steve to read my repository" you needed:

define is_steve user exact steve
allow "Steve can read my repo" is_steve op_read

And, as you'd expect, if you also wanted to grant read access to Jeff then you'd need yet set of defines:

define is_jeff user exact jeff
define is_steve user exact steve
define readers anyof is_jeff is_steve
allow "Steve and Jeff can read my repo" readers op_read

This, while flexible (and still entirely acceptable) is wordy for small rulesets and so we added sub-defines to create this syntax:

allow "Steve and Jeff can read my repo" op_read [anyof [user exact jeff] [user exact steve]]

Of course, this is generally neater for simpler rules, if you wanted to add another user then it might make sense to go for:

define readers anyof [user exact jeff] [user exact steve] [user exact susan]
allow "My friends can read my repo" op_read readers

The nice thing about this sub-define syntax is that it's basically usable anywhere you'd use the name of a previously defined thing, they're compiled in much the same way, and Richard worked hard to get good error messages out from them just in case.

No more auto_user_XXX and auto_group_YYY

As a result of the above being implemented, the support Gitano previously grew for automatically defining users and groups has been removed. The approach we took was pretty inflexible and risked compilation errors if a user was deleted or renamed, and so the sub-define approach is much much better.

If you currently use auto_user_XXX or auto_group_YYY in your rulesets then your upgrade path isn't bumpless but it should be fairly simple:

  1. Upgrade your version of lace to 1.3
  2. Replace any auto_user_FOO with [user exact FOO] and similarly for any auto_group_BAR to [group exact BAR].
  3. You can now upgrade Gitano safely.

No more 'basic' matches

Since Gitano first gained support for ACLs using Lace, we had a mechanism called 'simple match' for basic inputs such as groups, usernames, repo names, ref names, etc. Simple matches looked like user FOO or group !BAR. The match syntax grew more and more arcane as we added Lua pattern support refs ~^refs/heads/${user}/. When we wanted to add proper PCRE regex support we added a syntax of the form: user pcre ^/.+?... where pcre could be any of: exact, prefix, suffix, pattern, or pcre. We had a complex set of rules for exactly what the sigils at the start of the match string might mean in what order, and it was getting unwieldy.

To simplify matters, none of the "backward compatibility" remains in Gitano. You instead MUST use the what how with match form. To make this slightly more natural to use, we have added a bunch of aliases: is for exact, starts and startswith for prefix, and ends and endswith for suffix. In addition, kind of match can be prefixed with a ! to invert it, and for natural looking rules not is an alias for !is.

This means that your rulesets MUST be updated to support the more explicit syntax before you update Gitano, or else nothing will compile. Fortunately this form has been supported for a long time, so you can do this in three steps.

  1. Update your gitano-admin.git global ruleset. For example, the old form of the defines used to contain define is_gitano_ref ref ~^refs/gitano/ which can trivially be replaced with: define is_gitano_ref ref prefix refs/gitano/
  2. Update any non-zero rulesets your projects might have.
  3. You can now safely update Gitano

If you want a reference for making those changes, you can look at the Gitano skeleton ruleset which can be found at https://git.gitano.org.uk/gitano.git/tree/skel/gitano-admin/rules/ or in /usr/share/gitano if Gitano is installed on your local system.

Next time, I'll likely talk about the deprecated commands which are no longer in Gitano, and how you'll need to adjust your automation to use the new commands.

This blog is powered by ikiwiki.