what is a computer?
- any device, entity, or object that can perform computations
- need not be electronic in nature
processor/microprocessor/central processing unit (CPU)
- core unit that does the actual computations.
- only does a few things, such as:
- jump to a particular location in RAM
- read a number from RAM
- save a number to RAM
- add two numbers together
- compare two numbers to see which is larger
- send a number to a particular device (e.g. a monitor)
- receive a number from a particular device (e.g. a keyboard)
- the processor is often called "the brains" of the computer.
- processors are grouped in families that all share similar characteristics
- usually each family has a specific variety of machine code that it understands
- every instruction telling the processor what to do must be specified as a sequence of these machine codes
- watch video: See How the CPU Works In One Lesson
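The short list of processor operations above can be sketched as a toy simulator. This is only an illustration: the operation names (`READ`, `SAVE`, `JUMP_IF`, and so on), the single accumulator, and the choice to keep the program in a separate list rather than in RAM are all assumptions made for this example, and they do not correspond to any real processor family.

```python
# A toy processor. The operation names and their encoding are invented for
# illustration; real processors differ in many details.

def run(program, ram, inputs=()):
    """Execute a list of (operation, operand...) tuples; return the outputs."""
    inputs = list(inputs)   # numbers waiting on a pretend input device
    outputs = []            # numbers sent to a pretend output device
    acc = 0                 # accumulator: the one number the CPU is holding
    flag = False            # result of the most recent comparison
    pc = 0                  # program counter: index of the next instruction

    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "READ":        # read a number from RAM
            acc = ram[args[0]]
        elif op == "SAVE":      # save a number to RAM
            ram[args[0]] = acc
        elif op == "ADD":       # add a number from RAM to the accumulator
            acc += ram[args[0]]
        elif op == "CMP":       # compare: is the accumulator the larger one?
            flag = acc > ram[args[0]]
        elif op == "JUMP":      # jump to a particular location
            pc = args[0]
        elif op == "JUMP_IF":   # jump only if the last comparison was true
            if flag:
                pc = args[0]
        elif op == "RECV":      # receive a number from a device
            acc = inputs.pop(0)
        elif op == "SEND":      # send a number to a device
            outputs.append(acc)
        elif op == "HALT":      # stop running
            break
        else:
            raise ValueError(f"unknown operation: {op}")
    return outputs


# Receive two numbers, add them, and send the sum to the output device.
add_program = [
    ("RECV",),     # acc = first input
    ("SAVE", 0),   # ram[0] = acc
    ("RECV",),     # acc = second input
    ("ADD", 0),    # acc = acc + ram[0]
    ("SEND",),     # output acc
    ("HALT",),
]
print(run(add_program, [0] * 4, inputs=[2, 3]))   # [5]

# Send whichever of ram[0] and ram[1] is larger.
larger_program = [
    ("READ", 0),     # acc = ram[0]
    ("CMP", 1),      # flag = acc > ram[1]
    ("JUMP_IF", 4),  # if ram[0] was larger, skip the next instruction
    ("READ", 1),     # otherwise acc = ram[1]
    ("SEND",),
    ("HALT",),
]
print(run(larger_program, [7, 4]))   # [7]
```

Even with this handful of operations, combining compare and jump is enough to make decisions, which is how larger programs are built up from very simple steps.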
- clock speed is the speed of the processor's electric pulses (a clock oscillating at a certain frequency to keep time).
- measured in hertz (Hz), i.e. cycles per second
- the processor can only perform a fixed number of computations with each pulse
- the pulses that the clock sends to the processor are typically 5 Volts or less.
- see more on this topic
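The relationship between frequency and time can be worked through with a little arithmetic. The sketch below assumes a hypothetical 3 GHz clock purely as an example figure; the function names are made up for this illustration.

```python
GHZ = 1_000_000_000  # 1 gigahertz = one billion cycles per second


def seconds_per_cycle(frequency_hz):
    """Return how long one clock pulse lasts, in seconds."""
    return 1 / frequency_hz


def cycles_in(duration_seconds, frequency_hz):
    """Return how many clock pulses occur in the given span of time."""
    return duration_seconds * frequency_hz


clock = 3 * GHZ                    # hypothetical 3 GHz processor clock
print(seconds_per_cycle(clock))    # a fraction of a nanosecond
print(cycles_in(1, clock))         # 3000000000
```

Since the processor can only do a fixed amount of work per pulse, a higher frequency means more pulses per second and therefore more computations per second, all else being equal.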
- source code is the code written by the creator of the program (usually a human).
- written as plain text
- source code must usually be translated into some other type of code that the processor can natively understand, called machine code, before it can be executed.
- machine code is the language understood by the processor
- each family of processors has its own variety of machine code
- instructions in machine code can be directly executed by the processor it is designed for
- lowest level of code that the processor knows how to execute (i.e. how to "run")
- binary code: all 0's and 1's
- see more on number systems
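To make "all 0's and 1's" concrete, here is a small sketch that converts an ordinary number to its binary digits and back. The helper names `to_bits` and `from_bits` are made up for this example, and the 8-bit width is an arbitrary choice.

```python
def to_bits(number, width=8):
    """Return a non-negative integer as a fixed-width string of 0s and 1s."""
    return format(number, f"0{width}b")


def from_bits(bits):
    """Return the integer that a string of 0s and 1s stands for."""
    return int(bits, 2)


print(to_bits(5))              # 00000101
print(from_bits("00000101"))   # 5
```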
- assembly language is a simple set of mnemonics, such as 'add' or 'sub' for subtract.
- each mnemonic is a shorthand code that directly maps to one of the binary machine code instructions available on the processor
- for every machine code instruction, there is a mnemonic using alphabetic characters that can be written instead
- assembling is the process of translating assembly language mnemonic codes into machine code for a specific processor family
- an assembler is the software that assembles
- once assembled, the machine code is executable on the processors for which it was designed
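The mnemonic-to-machine-code mapping described above can be sketched as a tiny assembler. Everything specific here is an assumption for illustration: the four mnemonics, their opcode numbers, and the 8-bit instruction layout (4 bits of opcode followed by 4 bits of operand) are invented and do not match any real processor family.

```python
# Hypothetical opcode table: the numbers are invented for this example.
OPCODES = {"load": 0b0001, "add": 0b0010, "sub": 0b0011, "store": 0b0100}


def assemble(source):
    """Translate lines such as 'add 3' into a list of 8-bit machine words.

    Each word packs a 4-bit opcode followed by a 4-bit operand.
    """
    words = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()   # drop comments and whitespace
        if not line:
            continue                        # skip blank lines
        mnemonic, operand = line.split()
        value = int(operand)
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        if not 0 <= value <= 15:
            raise ValueError(f"operand does not fit in 4 bits: {value}")
        words.append((OPCODES[mnemonic] << 4) | value)
    return words


program = """
load 2    ; fetch a number
add 3     ; add another number to it
store 4   ; save the result
"""

for word in assemble(program):
    print(format(word, "08b"))
# 00010010
# 00100011
# 01000100
```

Each line of assembly becomes exactly one machine word, which is what makes assembling much simpler than compiling a high-level language.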
- compiling is translating an entire program from one programming language into another
- this term is most often used to refer to the process of translating code in a high-level programming language directly to machine code
- but also used to refer to the process of translating high-level programming code to an intermediate-level language, such as byte code.
- if compiled directly to machine code, that machine code is then "executable", meaning it can be run on the processor, if desired
- another program, called an executor, can then be used to execute the compiled machine code on the processor, if desired.
- if compiled to an intermediate-level language instead of machine code, another compiler or interpreter must then be used to translate that intermediate-level language to machine code before it can be executed. There are reasons why this might be desirable, which we will talk about.
- at a simplified level, an interpreter translates a single statement of source code at a time from a higher-level programming language into machine code and immediately executes that machine code on the processor
- the code is being translated "on the fly" into machine code and immediately executed
- contrast this with compiling, which involves bulk analysis and translation of an entire source code base to machine code, but does not execute the machine code
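The statement-at-a-time idea can be sketched with a toy interpreter for a made-up three-command language (`set`, `add`, `print`). Note the simplification: this sketch carries out each statement directly in Python rather than producing machine code, so it only illustrates the "translate one statement, run it immediately, move on" rhythm.

```python
def execute(statement, variables, output):
    """Carry out a single statement of the made-up language."""
    command, name, *rest = statement.split()
    if command == "set":        # set x 2   -> x = 2
        variables[name] = int(rest[0])
    elif command == "add":      # add x 3   -> x = x + 3
        variables[name] += int(rest[0])
    elif command == "print":    # print x   -> record the value of x
        output.append(variables[name])
    else:
        raise ValueError(f"unknown command: {command}")


def interpret(source):
    """Handle one statement at a time, executing each before reading the next."""
    variables, output = {}, []
    for statement in source.splitlines():
        if statement.strip():   # ignore blank lines
            execute(statement, variables, output)
    return output


print(interpret("set x 2\nadd x 3\nprint x"))   # [5]
```

Because nothing is analyzed in advance, a mistake on the last line of a program would only be noticed after every earlier line had already run; a compiler, which examines the whole program first, would typically report it before anything executes.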
the reality of compiling vs interpreting
- read this short explanation by a user on stackoverflow.com for a bit more of the reality of modern-day interpreters, which are more complex than my simplified explanation
Many real world high-level programming language implementations use one or both of compiling and interpreting. Here are a few examples:
- high-level C code is compiled directly to machine code for a specific processor family
- that compiled machine code can then be executed on the processors for which it was compiled
- high-level Java code is compiled to intermediate-level bytecode
- bytecode can then be interpreted into machine code by any Java Virtual Machine
- high-level Python code is compiled to intermediate-level bytecode
- bytecode can then be interpreted into machine code by any Python Virtual Machine
- Python and Java both compile to their own versions of bytecode
- Bytecode is a form of code that is interpreted natively by the corresponding virtual machine.
- Bytecode is so called because the operation code (opcode) that identifies each instruction is one byte (8 bits) of data; an instruction may also carry additional operand bytes
- The advantage of bytecode is that every processor for which a Python Virtual Machine or Java Virtual Machine has been created can execute the respective bytecode, whereas machine code typically differs from one processor family to the next. So any given machine code can only run on the processor family for which it was created, whereas bytecode can run on any computer that has the corresponding virtual machine installed.
- this allows Python and Java to be "write once, run anywhere" languages.
- see this explanation by a Python core developer for a more detailed description of what bytecode is (starting at 2:10 in the video)
- Java was one of the early languages that relied on the bytecode compiler/interpreter paradigm
- in Java, any family of processors that has had a "Java Virtual Machine" (JVM) designed for it can run the Java bytecode
- the people who make Java have made JVMs for most popular processor families
- a programmer writes high-level Java code, which is compiled to bytecode; the bytecode is saved onto whatever computer the user wants to run the program on, and the JVM on that computer interprets the bytecode into the appropriate machine code for that processor family
- so Java is a "write once, run anywhere" sort of language
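For the Python case described above, the compile-then-interpret pipeline can be observed from within Python itself using the built-in `compile` and `exec` functions and the standard `dis` module. This is a minimal sketch assuming the standard CPython implementation; the exact instructions listed by `dis` vary between Python versions, so no listing is reproduced here.

```python
import dis

source = "total = 2 + 3"

# Step 1: compile the source code to a code object holding bytecode.
code = compile(source, "<example>", "exec")
print(type(code.co_code))    # <class 'bytes'>: the raw bytecode

# A human-readable listing of the bytecode instructions
# (the output differs from one Python version to another).
dis.dis(code)

# Step 2: have the Python virtual machine execute the bytecode.
namespace = {}
exec(code, namespace)
print(namespace["total"])    # 5
```

Normally both steps happen automatically when a program is run with `python`, which is why Python is often described simply as "interpreted".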
Documenting code is important for readability and maintenance of that code. Most languages have common conventions for how developers leave notes and document their code.
- See Python docstring conventions
- All Python programs must be documented following these conventions
- See Documenting source code using Javadoc
- All Java programs must be documented following these conventions
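As a rough illustration of the Python side, here is a small function documented in the general style of the Python docstring conventions (PEP 257): a one-line summary, a blank line, then further detail. The function itself is a made-up example, and the linked conventions remain the authority on the exact format expected.

```python
def average(numbers):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    Arguments:
    numbers -- a sequence (such as a list) of ints or floats

    Raises ValueError if the sequence is empty.
    """
    if not numbers:
        raise ValueError("cannot average an empty sequence")
    return sum(numbers) / len(numbers)


print(average([2, 4, 6]))                 # 4.0
print(average.__doc__.splitlines()[0])    # the one-line summary
```

The docstring is stored with the function, so tools such as the built-in `help` function can display it to anyone using the code.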
reminder not to copy code
- Copying of code in most languages is easily caught and the consequences are dire: for example...