5 files changed, 350 insertions, 11 deletions
diff --git a/notes/algorithms.tex b/notes/algorithms.tex
index 581a5e5..c745b72 100644
--- a/notes/algorithms.tex
+++ b/notes/algorithms.tex
@@ -84,3 +84,23 @@ fewer comparisons will be made.
 \subsection{Bubble sort}
 
 \subsection{Merge sort}
+
+\section{Programming languages}
+
+Programming languages are useful to humans in writing algorithms for
+computers to run. There are \textit{high level languages} (such as
+Python, Haskell, and C) which are easier for humans to write and
+understand, \textit{low level languages} (assembly code) which are
+human readable but directly represent machine code, and machine code
+which is not readable by humans as it is binary code but is the only
+information the computer actually understands and runs.
+
+Some languages (such as C) are \textit{compiled}. This involves the
+entire \textit{source code} being turned into a machine code file by
+the compiler. Other languages (such as Python) are
+\textit{interpreted}. Here, an interpreter turns a line of code into
+machine code, runs it and then moves on to the next line. The benefit
+of a compiler is that once compiled, the machine code file is very
+fast to run. However, an interpreter offers easier development as
+there are no long compile times but it is slower than code that is
+already compiled.
diff --git a/notes/cyber.tex b/notes/cyber.tex
new file mode 100644
index 0000000..efa62ff
--- /dev/null
+++ b/notes/cyber.tex
@@ -0,0 +1,76 @@
+\chapter{Cyber security}
+
+Cyber security is the study of the relation between computers,
+networks, and malicious threats and attacks that they are vulnerable
+to.
+
+\section{Threats}
+
+\begin{itemize}
+    \item \textit{Social engineering} involves exploiting people
+        directly for access or information. \textit{Blagging} involves
+        obtaining information through deception or impersonation, such
+        as calling someone whilst posing as a friend.
+        \textit{Phishing} is posing as a legitimate organisation to
+        obtain personal information, generally through email.
+        \textit{Pharming} involves a bogus website that imitates a
+        legitimate one. \textit{Shouldering} involves watching
+        somebody enter their personal information.
+    \item \textit{Malicious code} is code written to cause harm. A
+        \textit{virus} damages a computer and spreads itself to other
+        devices, for example over the internet. \textit{Spyware} is
+        software that monitors, logs, and sends information to the
+        spy. For example, a keylogger may record every key a user
+        presses and send it to the spy so that information such as a
+        password can be extracted. \textit{Adware} is a program that
+        is designed to show the user advertisements and a
+        \textit{Trojan} is any malware that poses as legitimate
+        software.
+    \item \textit{Weak passwords} or \textit{misconfigured access
+        rights} may allow an attacker easy access to unauthorised
+        data. Access rights would normally restrict certain
+        information from certain users.
+    \item \textit{Removable media} such as a DVD or USB flash drive is
+        a vector by which malware can easily spread, particularly when
+        distributed, such as at a public event.
+    \item Unpatched or outdated software may contain known
+        vulnerabilities, and even up-to-date software may have
+        recently discovered vulnerabilities, which an attacker could
+        exploit.
+\end{itemize}
+
+\section{Threat prevention}
+
+\subsection{MAC Address filtering}
+
+A \textit{MAC Address} is unique to each device. Filtering MAC
+Addresses could mean only allowing authorised devices to connect to
+the network (\textit{whitelisting}) or blocking certain devices from a
+network (\textit{blacklisting}). However, this is bypassable through
+MAC address \textit{spoofing}, where a device can appear to have a MAC
+address other than its own.
+
+\subsection{Firewall}
+
+A firewall filters network traffic, blocking activity that is not
+permitted. This may mean blocking access to certain sites, or
+preventing external activity from potential attackers.
+
+\subsection{Authentication}
+
+Authentication is the validation of identity through credentials. The
+most common form of this is through a username and password. It can
+also be through physical objects such as cards (such as a credit card)
+and through biometric methods such as fingerprints.
+
+CAPTCHA (tests that determine if a user is a human, such as by typing
+in a word shown in a distorted font) and e-mail verification (where the user
+must respond to an e-mail only they could have received) can also be
+used as authentication and to ensure that the user is human and not an
+automated attack.
+
+\subsection{Encryption}
+
+Encrypted data is encoded in such a way that only the sender and
+recipient, or sometimes only the recipient, can decode the data and
+read the information. To anyone else (such as an eavesdropper), the
+data is meaningless.
diff --git a/notes/data-rep.tex b/notes/data-rep.tex
new file mode 100644
index 0000000..f2f5ef4
--- /dev/null
+++ b/notes/data-rep.tex
@@ -0,0 +1,195 @@
+\chapter{Data representation}
+
+\section{Metadata}
+
+Metadata means data about data. It is the information stored in a file
+that is not part of the main information, but instead important
+properties and data of the file, such as the author name of a PDF
+document. Although we do not take it into account when calculating
+file size, it is important to realise that in the real world, it would
+be there.
+
+\section{Character encoding}
+
+The \textit{American Standard Code for Information Interchange}
+(ASCII) is a character encoding standard. It comprises
+a \textit{character set} which contains characters and numerical
+representations of each of them, so that they can be processed and
+stored by a computer. Standard ASCII uses 7 bits to represent each
+character and can thus store 128 characters (127 characters and 1
+null character). The uppercase letter A in ASCII is character 65
+($1000001_2$), and the rest of the uppercase alphabet follows in
+order, so B is 66, C is 67, and so on.
+
+Extended ASCII uses 8 bits to store each character and thus contains
+256 characters (255 characters and 1 null character), allowing for
+more than the limited alphabet standard ASCII contains by including
+more symbols and some accented characters. This, however, is quite
+limited in comparison to Unicode, which is a significantly larger set
+that allows for character encoding in many languages and many symbols
+such as emoji characters. Unicode is also compatible with many more
+devices. The drawback to extended ASCII, and even more so to Unicode,
+is that they require more space to encode characters.
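+
+Both character counts follow from the number of distinct patterns
+that $n$ bits can form, which is $2^n$. As a quick check (the symbol
+$n$ is introduced here purely for illustration):
+\begin{align*}
+    2^7 = 128, \qquad 2^8 = 256
+\end{align*}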
+
+\section{Images}
+
+A \textit{bitmap} (such as jpg, png) image is made up of rows of
+pixels, each with a colour value.  The resolution of the image, the
+$\mbox{width}\times\mbox{height}$ of pixels, determines the detail the
+image may contain. The colour depth is the number of bits of
+information that each pixel's colour is encoded with.  As with 7 bit ASCII,
+the number of colours that a pixel can possibly have is determined by:
+
+\begin{equation}
+    \mbox{\textit{Number of colours}} = 2^{\mbox{\footnotesize Bit\ depth}}
+\end{equation}
+
+For example, a 4 bit depth would mean that each pixel could store one
+of 16 colours. As this is all of the information an image file stores
+(ignoring metadata), we can deduce that the file size in bits is:
+
+\begin{equation}
+    \mbox{\textit{File size}} =
+    \mbox{width}\times\mbox{height}\times\mbox{bit depth}
+\end{equation}
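+
+As a worked illustration, assume a hypothetical $100\times 50$ pixel
+image with a 4 bit colour depth (values chosen only for this example,
+with metadata ignored). The formula above gives the size in bits, and
+dividing by 8 converts this to bytes:
+\begin{align*}
+    100\times 50\times 4 = 20000\mbox{ bits} = 2500\mbox{ bytes}
+\end{align*}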
+
+A \textit{vector} image does not use pixels, but instead mathematical
+descriptions of shapes to form an image. This means that, whereas
+zooming into a bitmap image loses quality because the pixels become
+visible, a vector image stays sharp at any scale. However, vector
+images are poorly suited to representing a photograph and are
+generally used for design, animation, or games.
+
+\section{Sound}
+
+A \textit{microphone} transforms vibrations in the air (sound) into an
+analogue signal (\textit{voltage}). Digital circuitry can take a
+\textit{sample} of this analogue signal at set intervals, thus
+producing a close representation of the original. The
+\textit{resolution} of a sound file is based on the number of possible
+values a single sample can have, much like colour depth, where the
+resolution is measured in bits. The \textit{sample rate} (measured in
+Hertz (Hz), meaning `per second') is the number of samples taken per
+second. By multiplying the sample rate by the time of the recording in
+seconds, we have the number of samples taken. Then by multiplying this
+by the number of bits each sample requires (the resolution) we have
+the file size.
+
+\begin{equation}
+    \mbox{\textit{File size}} =
+    \mbox{sample rate}\times\mbox{resolution}\times\mbox{time in
+    seconds}
+\end{equation}
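+
+As a worked illustration, assume a hypothetical 10 second recording
+sampled at 44100 Hz with a 16 bit resolution (illustrative values, a
+single channel, and metadata ignored):
+\begin{align*}
+    44100\times 16\times 10 = 7056000\mbox{ bits} = 882000\mbox{ bytes}
+\end{align*}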
+
+\section{Data compression}
+
+If files such as image, sound, or perhaps video were stored
+uncompressed then they would take a lot of space on hard drives and
+require longer to transfer over a network. Therefore, data compression
+is generally used which involves using algorithms to reduce the size
+of a file. \textit{Lossless compression} reduces the size of a file
+without losing any information, and we will examine two examples of
+this. \textit{Lossy
+compression}\footnote{\url{https://en.wikipedia.org/wiki/Data_compression}}
+involves removing data, such as removing quiet tones from an audio
+file.
+
+\subsection{Run length encoding}
+
+Run length encoding is a form of compression where repeated values are
+compressed. For example, take the binary
+\begin{align*}
+    111110000001110000
+\end{align*}
+by compressing the runs where a zero or a one is repeated, it can
+become
+\begin{align*}
+    (5,1),(6,0),(3,1),(4,0)
+\end{align*}
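+
+Whether this saves space depends on how each pair is stored. As a
+rough sketch, assume (purely for illustration) that every run length
+is stored in 3 bits and every value in 1 bit. The four pairs then
+need
+\begin{align*}
+    4\times(3+1) = 16\mbox{ bits}
+\end{align*}
+compared with the 18 bits of the original, a small saving. Data made
+of very short runs could even grow under this scheme.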
+
+\subsection{Huffman encoding}
+
+Huffman encoding is a method of encoding text. Suppose we have the text
+\begin{align*}
+    \mbox{repetitive}
+\end{align*}
+We want to create a Huffman tree by writing out each character and the
+number of times it occurs at the bottom. Then we join together the two
+lowest numbers and add their number of occurrences. We keep doing this
+until there is one number left. This ensures that the letters
+that occur the most are closest to the top of the tree. Each branch is
+then labelled zero or one, depending on whether it goes left or right
+respectively. Each letter then has a new code, based on the path from
+the top of the tree to it.
+
+%% requires package \usepackage[edges]{forest}
+%\begin{forest}
+%    for tree={
+%        grow=south,
+%        s sep=7mm
+%    }
+%    [rpveit (10)
+%    [rpve (6)
+%    [e (3)]
+%    [rpv (3)
+%        [ rp(2)
+%        [r (1)]
+%        [p (1)]
+%        ]
+%        [v (1)]
+%        ]
+%        ]
+%        [it (4)
+%        [i (2)]
+%        [t (2)]
+%        ]
+%        ]
+%        ]
+%\end{forest}
+
+\begin{tikzpicture}
+    \node at (0,0) (top) {rpveit (10)};
+
+    \node at (-2, -1.5) (2a) {rpve (6)};
+    \node at (4, -3) (2b) {it (4)};
+
+    \node at (-1, -3) (3a) {rpv (3)};
+    \node at (-2, -4.5) (3b) {rp (2)};
+
+    \node at (-6, -6) (e) {e (3)};
+    \node at (-4, -6) (r) {r (1)};
+    \node at (-1, -6) (p) {p (1)};
+    \node at (1, -6) (v) {v (1)};
+    \node at (3, -6) (i) {i (2)};
+    \node at (5, -6) (t) {t (2)};
+
+
+
+    \draw (top) to node[above]{$0$} (2a);
+    \draw (3a) to node[above left]{$0$} (3b);
+
+    \draw (top) to node[above]{$1$} (2b);
+    \draw (2a) to node[above right]{$1$} (3a);
+
+    \draw (3b) to node[above]{$0$} (r);
+    \draw (2b) to node[above left]{$0$} (i);
+    \draw (2a) to node[left]{$0$} (e);
+
+    \draw (3b) to node[above right]{$1$} (p);
+    \draw (2b) to node[above right]{$1$} (t);
+    \draw (3a) to node[above]{$1$} (v);
+\end{tikzpicture}
+Here, the letter e has the code $00$, the letter r has the code
+$0100$, the letter p has the code $0101$, and so on, as these are the
+paths from the top of the tree to the letter in question. If we were
+encoding with standard ASCII (7 bits per character) then the size of
+the string would be $7\times 10 = 70$ bits (as there are 10
+characters).
+
+Using the Huffman tree we have generated we can encode \textit{repetitive} as
+\begin{align*}
+    0100000101001110111001100
+\end{align*}
+This uses 25 bits, which is a $(70-25)\div 70\times 100 \approx
+64.3\%$ saving.
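+
+The 25 bit total can be checked, as a sketch, by multiplying each
+letter's number of occurrences by the length of its code and adding
+the results. Taking the letters in the order e, i, t, v, r, p (e, i,
+and t have 2 bit codes, v has a 3 bit code, and r and p have 4 bit
+codes):
+\begin{align*}
+    (3\times 2) + (2\times 2) + (2\times 2) + (1\times 3) +
+    (1\times 4) + (1\times 4) = 25
+\end{align*}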
diff --git a/notes/legal.tex b/notes/legal.tex
new file mode 100644
index 0000000..5b74b79
--- /dev/null
+++ b/notes/legal.tex
@@ -0,0 +1,50 @@
+\chapter{Impacts of computing}
+
+Computing and computer science have very clearly transformed our world
+in many ways.
+
+\section{Legal}
+
+Computing has many legal consequences. The law applies differently on
+the internet than in the wider world, partly because of new abilities
+such as distributing information. There are many laws which apply to
+the internet, but here are a few:
+
+\subsection{Data Protection Act (1998)}
+
+The Data Protection Act makes it a requirement for organisations to
+follow strict guidelines when storing people's sensitive or personal
+data. A more recent, similar law is the \textit{General Data
+Protection Regulation} (GDPR) (2016).
+
+\subsection{Computer Misuse Act (1990)}
+
+The Computer Misuse Act makes it illegal to gain
+\textbf{unauthorised} access to a computer, or to use a computer in
+committing an offence.
+
+\subsection{Copyright, Designs and Patents Act (1988)}
+
+The Copyright, Designs and Patents Act gives people or businesses
+ownership of their creative work, such that they have complete
+commercial rights to it, thereby making \textit{piracy} of content or
+software illegal.
+
+\section{Ethics}
+
+Ethical considerations involve thinking about the aspects of computing
+that are good or bad, or right or wrong. What is ethical or
+moral has no strict definition, although the law is based on ethics.
+Examples of ethical considerations include terms of service,
+offensive language and freedom of speech, the increasing use of phones
+among young people, and the sourcing of materials for use in
+electronics from places such as the Congo\footnote{Rory Stewart OBE:
+``Failed States - and How Not to Fix Them''; Yale University;
+\url{https://www.youtube.com/watch?v=zMXXJqvMdk4}}. Thus ethics are
+open to interpretation and your own opinion.
+
+\section{Environment}
+
+Computing has many environmental impacts. From electricity usage to
+e-waste produced by the ongoing release of new devices, technology has
+an immense impact on our environment.
diff --git a/notes/paper.tex b/notes/paper.tex
index 48dce83..b28234a 100644
--- a/notes/paper.tex
+++ b/notes/paper.tex
@@ -52,7 +52,7 @@ top=0.5in, bottom=0.8in ]{geometry}
 
 \titleformat{\section}{\large}{}{0em}{}
 \titleformat{\subsection}{\bfseries}{}{0em}{}
-\titleformat{\chapter}{\vspace{-2cm}\tt\huge\itshape}{}{0em}{}
+\titleformat{\chapter}{\vspace{-2cm}\tt\huge\itshape}{\thechapter:}{5mm}{}
 
 \begin{document}
 \begin{titlepage}
@@ -75,6 +75,9 @@ top=0.5in, bottom=0.8in ]{geometry}
 \input{hardware.tex}
 \input{software.tex}
 \input{networks.tex}
+\input{legal.tex}
+\input{cyber.tex}
+\input{data-rep.tex}
 
 \chapter{Acknowledgements, about, and license}
 Notes for AQA GCSE Computer Science.\\
@@ -97,15 +100,10 @@ time on trying to answer questions and play with the ideas discussed.
 
 \vspace{2cm}
 
-\begin{myquote}
-
-    This work is licensed under the Creative Commons
-    Attribution-ShareAlike 4.0 International License. To view a copy
-    of this license, visit
-    \url{http://creativecommons.org/licenses/by-sa/4.0/} or send a
-    letter to Creative Commons, PO Box 1866, Mountain View, CA 94042,
-    USA.
-
-\end{myquote}
+This work is licensed under the Creative Commons
+Attribution-ShareAlike 4.0 International License. To view a copy of
+this license, visit
+\url{http://creativecommons.org/licenses/by-sa/4.0/} or send a letter
+to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
 
 \end{document}