A microcontroller unit (MCU) is a type of integrated circuit that consists of a CPU, programmable memories, and a variety of peripherals. In a nutshell, it is a collection of useful functions and sub-circuits linked together by a central processing unit, the CPU. In other words, an MCU is a more abstract way to build sophisticated circuits with ease and speed. The 8-bit AVR is one such MCU, distinguished by its Harvard reduced instruction set computer (RISC) architecture. There are 5 things to take note of here:

1. An MCU can be considered a computer, but with very limited processing power: an AVR delivers roughly 1 MIPS (million instructions per second) per MHz, a number very low compared to today's high-end processors of up to 300k MIPS.

2. A Harvard CPU architecture is one in which program memory and data memory are separated. Technically, some AVRs are more precisely a modified Harvard architecture, because their LPM instruction can still read data from the program memory space, so program memory can be allocated as constant data memory, and their SPM instruction can write into their own program memory space, the so-called self-programming feature.

3. The RISC architecture is very simple in its design philosophy. The advantage is that instructions are predictable, so pipelining is possible, the so-called fetch and execute, with most instructions executed in one clock cycle. The disadvantage is that code size is much larger for more complex operations.

4. Since program and data memory are separated, their bus widths can differ. For the 8-bit AVR, program memory is 16 bits (1 word, or 2 bytes) wide, while data memory is 8 bits wide, which is why it is an 8-bit MCU.

5. Finally, the data memory space is arranged as a register file of 32 general-purpose registers, followed by the SFRs (special function registers, the I/O registers), and then the SRAM. The peripherals and the control of the MCU are mainly accessed by reading from and writing new values to the SFRs.
The SFR space consists of 64-224 bytes of 8-bit registers, depending on the device.
The SFRs can be considered an input/output file, and the physical input/output pins are also controlled by SFRs. For instance, to toggle an LED connected to port B, pin 0 (PB0), the direction of PB0 needs to be configured as output, and then PB0 is toggled at the desired frequency; this is done with 2 SFRs: DDRB and PORTB. Both of these are 8-bit registers inside the SFR space. The memory location of DDRB on an ATmega8 is 0x17 (0x37): 0x37 is the real data-space address, while 0x17 is the I/O offset. 0x37 - 0x17 = 0x20, a hexadecimal number equal to 32 in decimal, since the data address space is arranged like so: 32 (registers) + (64-224) (SFRs) + (SRAM). This is only relevant when programming in assembly. For C programs, use the predefined names found in the datasheet; these are in fact dereferenced volatile 8-bit pointers.
The CPU may seem complicated, but in reality what it does is load instructions from the program memory space and run them sequentially, and it does so via the program counter, or PC for short. The PC is nothing but a number pointing to the memory location in program memory from which the CPU fetches its next instruction. To make programming more manageable, most code is separated into code blocks, and to run any particular block, the PC is set to the first memory location of that block, very similar to accessing an array at a higher level. The PC increments itself automatically once the current instruction is executed. Most of the time the running code sequence is predictable, but that may not be desirable. An interrupt is a way to break out of this main routine and start running a new code block at its corresponding interrupt vector location. This is done using the stack. The stack is a temporary data space within the SRAM. A hard concept for beginners to grasp, it is merely a simple mechanism that saves the current PC to a user-defined location inside the SRAM before jumping to the new interrupt PC location, and loads the saved PC back upon exiting the interrupt subroutine; this is all done at the CPU hardware level using the PUSH and POP instructions. This hardware-level concept is so useful that modern programming languages implement their own software stacks. In essence, an interrupt is a time-sensitive, event-activated code block.
The timer is a multifunctional peripheral. For starters, it is a counter. Once configured, it runs independently of the CPU. An interrupt can be enabled to trigger a timed event at a guaranteed fixed interval, or the main code routine can check the interrupt flags and jump to a defined code block at its leisure. The watchdog timer, running on a separate clock source, is intended as a timeout mechanism to unfreeze the MCU when it gets stuck in an endless loop. On most of the newer AVR chips, the watchdog timer can both reset the MCU and issue an interrupt. The watchdog timer can also wake an MCU from deep-sleep battery-saving modes, and it can of course be used as a less flexible general-purpose timer, not a bad idea on an AVR with only one general-purpose timer unit, such as the ATtiny13A. The timer can also generate waveforms on the output-compare I/O pins, the so-called PWM pins, and it can do PFM and more. The timer can also accept a waveform as input on its ICP (input capture) pin; although only good for a relatively slow waveform range, the ICP can also be triggered in software, so it can measure time more flexibly and accurately than a software-based counter can. Normally the timer is driven by the CPU clock through a divider of 1 to 1024, but it can also be clocked asynchronously by an external 32 kHz crystal oscillator, which essentially turns it into an RTC, a real-time clock. An asynchronous timer interrupt is also one of the few possible wake-up sources for an MCU in deep sleep mode.
The ADC, or analog-to-digital converter, converts the voltage level on an ADC pin into a digital number. A simple concept, yet there is a lot to get right before an accurate and responsive reading can be acquired. At the hardware level, the ADC pin should be kept away from noisy neighboring traces, and the AVCC pin, the power supply pin for the ADC unit, should be coupled through an inductor. Any high-frequency noise on the ADC pin should be filtered out with a low-pass filter; one stage (first order) is sufficient most of the time. The ADC is designed for source impedances of about 10 kΩ or less, so to read a higher-resistance source, either an external voltage-follower circuit is required or the sampling rate of the ADC must be reduced in software. While there are several ADC pins and even more internal analog sources, there is only one ADC unit inside the MCU, so the ADC input sources are selected by a multiplexer, meaning only one ADC source can be read at a time. The successive-approximation circuit requires 13 ADC clock cycles for one reading, and the fastest ADC clock is the CPU clock divided by 2. This translates into a theoretical maximum of 8000000/2/13 ≈ 307k readings per second for an 8 MHz CPU, much less in reality by the way. While the internal bandgap voltage reference is accurate enough, for the most accurate ADC results an external voltage source can be applied to the AREF pin, but this not only increases circuit complexity but also limits flexibility on the software side, namely the voltage reference is fixed to the external one, and configuring the ADC incorrectly can short out the ADC reference circuitry in this case.
For serial communication, there are 3 main peripherals on most AVR MCUs: USART, SPI, and I2C. A user-defined serial protocol can also be implemented fairly easily on a few specific hardware pins, or even by hacking the above 3 serial peripherals. On newer AVRs, the USART hardware can natively be switched into an SPI mode. A fully software-based serial protocol should only be considered for simple serial communication. Both SPI and I2C are synchronous, meaning data are shifted in and out on a shared clock line. The USART can be either synchronous or asynchronous. In asynchronous mode, data are shifted in and out at an agreed speed, the baud rate, so clock accuracy is more important and both receiver and transmitter must be configured with the same baud rate. This opens up a fun hacking idea of interfacing multiple devices on the same serial bus using different baud rates, though this is already possible with digital address filtering, as implemented in the AVR's multi-processor communication mode. The simplicity of asynchronous serial means a one-way link can easily be established over different communication channels: radio, light, infrared, sound, etc. It is also possible to operate a UART over a single line, but this limits it to a much slower half-duplex mode and requires more complex software.
The hardware of the AVR MCU is just that, and while coding for the chip is almost as simple, in reality it can be complicated, from getting started, to actually writing the code, to compiling it, to uploading it. The programming choices for an AVR are C or assembly. While assembly is not very practical, it should be well understood, because 1, understanding assembly is equivalent to understanding the MCU, and 2, assembly is very easy. GNU has relatively good support for AVR in both C and C++, though with limited functionality, and for good reason. The AVR toolchain contains everything needed to develop a final working program. The process is to convert the source code into machine code and upload it to the MCU. Most of the time the final format is an Intel hex file, and most AVR MCUs support code uploading via their SPI pins, even on chips that lack an SPI peripheral, such as most of the ATtiny series. The actual uploading requires a piece of hardware called a programmer, and there are plenty available out there. It also requires uploading software and, of course, the driver for that particular programmer. AVRDUDE is one of the most popular free uploading programs and supports many programmers. It is relatively simple for the experienced to build a custom programmer. SPI is a very simple interface, but most computers do not support it natively, so a USB-to-SPI chip is required. The procedure for uploading to the AVR via SPI is simple and well documented in every AVR datasheet. The driver is provided by the chip maker already, and what needs to be done is to translate the uploading procedure into the uploading software. The majority of programmers and uploading software available today are outdated and very slow, but for uploading 8k bytes of data they are sufficient; still, since an AVR can be reprogrammed at least 10000 times, a faster programmer is always welcome.
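As a concrete example of the compile-then-upload flow, a minimal GNU-toolchain invocation looks something like this (the target chip and the USBasp programmer are illustrative choices; substitute your own):

```shell
# compile and link for the target MCU, optimizing for size
avr-gcc -mmcu=atmega8 -Os -o main.elf main.c

# extract an Intel hex file from the ELF binary
avr-objcopy -O ihex -R .eeprom main.elf main.hex

# upload over SPI with AVRDUDE (usbasp programmer, ATmega8 part)
avrdude -c usbasp -p m8 -U flash:w:main.hex
```

The `-U flash:w:main.hex` operation is where the SPI uploading procedure from the datasheet is actually carried out by AVRDUDE on your behalf.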
Object-oriented programming is a great, modern technique for building an application when you have plenty of processing power, but that is not necessarily true for an MCU with limited processing power and limited memory. Therefore, how to code an MCU is a no-brainer: use the simplest and most direct approach. Trying to conceptualize hardware in order to make it easier for someone to control may be counterproductive, because hardware doesn't exist in a vacuum, frozen at one moment in time. Digital hardware is actually a series of intricate cascading pathways connected by countless logic gates. The direct path is the most optimized path, but this requires finding that path; in this case, understanding how the hardware actually works.
Full clarity is the prerequisite when extremely high-quality code is the desired product. When I write a few lines of code, I can generally see how they would be compiled into machine code. Simple tasks at the CPU level may not be very intuitive when coding at the higher level of C, but understanding what the CPU can do, and then trying to write code that matches it, is crucial for the compiler to produce high-quality machine code. There are compromises in which path to emphasize, code optimized for speed or for size, though most of the time the two go hand in hand. For example, a function that takes a runtime variable is flexible, but slower. On the other hand, a function that requires compile-time constants may not need to load or push any variable at all, since the values are embedded into the CPU instructions.
Automation is probably the most useful field an MCU can be deployed in. Don't confuse automation with artificial intelligence (AI); even though they seem similar, they are very different. Automation is where you have a few predefined procedures for a few known conditions. AI is where you have to pick the best procedure out of a huge number of procedures for an unknown condition. Real AI is very computationally intensive, while automation is simple, dumb, repetitive code running in a loop, which a slow MCU is good at.
Fundamentally speaking, an MCU is a general-purpose IC, whereas a controller IC is designed for controlling one specific device, and the most important feature of an MCU is its large number of programmable input and output pins. This not only allows it to see and interact with the surrounding environment directly, but also means it can be turned into almost any specialized chip, which greatly simplifies the design and build of a system.
In short, the MCU is both universal and timeless. It is designed to last for 100 years; you can't say that about the chips inside your computer today. As software designers pile more "objects" into their applications, yesterday's old but capable computer simply can't handle the new world anymore. Because an MCU is so basic and simple, it is obsolescence-proof.