Chapter 4
Chemistry of Oil
Petroleum chemistry explained: paraffins, naphthenes, aromatics, olefins, and the molecular structures that determine fuel quality.
The Building Blocks: Carbon and Hydrogen
Crude oil chemistry underpins refinery processes, explains why different crudes and products trade at different values, and determines how petrochemicals are made. This chapter assumes no chemistry background. A molecule is two or more atoms held together by chemical bonds. Water contains two hydrogen (H) atoms and one oxygen (O) atom, hence the familiar H2O.
Crude oil typically consists of 84 to 87 percent carbon, 11 to 14 percent hydrogen, 0 to 6 percent sulfur, and less than 1 percent nitrogen, oxygen, metals, and salts by weight. Carbon and hydrogen combine to form hydrocarbons, the molecules that make oil so valuable. A barrel of crude can contain thousands of distinct hydrocarbons arranged in many different ways. The reason oil matters economically is simple: hydrocarbons release a very large amount of energy when combined with oxygen during combustion.

Valency and bonds
Each carbon atom has a valency of four: it forms four bonds with neighbouring atoms. Each hydrogen atom has a valency of one. When a carbon atom cannot find enough hydrogen partners, two carbons share a double bond (or, rarely, a triple bond) to satisfy that need for four connections. Single bonds are stable. Double and triple bonds are more reactive, because the extra bonds between the two carbons are weaker and easier to break open than a single bond. They are the defining feature of the unsaturated molecules discussed below.
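The valency rule can be turned into simple bookkeeping. The sketch below is illustrative only: the function name `hydrogen_count` and its parameters are assumptions made for this example, and it covers open-chain (non-ring) hydrocarbons only. Each carbon offers four bonding positions, every carbon-carbon bond uses one position on each of the two carbons it joins, and whatever is left over is filled by hydrogen.

```python
def hydrogen_count(carbons: int, double_bonds: int = 0, triple_bonds: int = 0) -> int:
    """Hydrogens needed to satisfy valency in an open-chain hydrocarbon."""
    if carbons < 1:
        raise ValueError("need at least one carbon")
    total_valence = 4 * carbons                     # each carbon forms four bonds
    chain_links = carbons - 1                       # single bonds joining the chain
    extra_bonds = double_bonds + 2 * triple_bonds   # bonds beyond the first in each multiple bond
    used_by_carbon = 2 * (chain_links + extra_bonds)  # each C-C bond uses one position on both ends
    hydrogens = total_valence - used_by_carbon
    if hydrogens < 0:
        raise ValueError("too many multiple bonds for this carbon count")
    return hydrogens


print(hydrogen_count(1))                   # methane: 4 hydrogens
print(hydrogen_count(2, double_bonds=1))   # ethylene: 4 hydrogens
print(hydrogen_count(2, triple_bonds=1))   # acetylene: 2 hydrogens
```

Each double bond costs the molecule two hydrogens and each triple bond costs four, which is the sense in which such molecules are "unsaturated".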
Carbon Count, Boiling Point, and Physical State
Refining separates crude into products based on carbon count, because molecules with fewer carbons boil at lower temperatures. At normal atmospheric pressure and room temperature, hydrocarbons with 1 to 4 carbons are typically gases, those with 5 to 24 carbons are liquids, and those with 25 or more carbons are solids. A heavy crude contains a higher fraction of long-chain molecules; a light crude is skewed toward short ones.
Table 4-1: Crude oil fractions by carbon count
| Fraction | Carbons | State | Boiling range | Primary uses |
|---|---|---|---|---|
| Petroleum gases (methane, ethane, propane, butane) | 1 to 4 | Gas | Below 20 °C | Heating, power, LPG, petrochemical feedstock |
| Light ends (naphtha, gasoline) | 5 to 11 | Liquid | 70 to 200 °C | Gasoline, solvents, petrochemicals |
| Middle distillates (kerosene, gas oil) | 11 to 18 | Liquid | 200 to 300 °C | Jet fuel, diesel, heating oil |
| Heavy gas oil, lube base | 18 to 25 | Liquid | 300 to 400 °C | Lubricants, feed for cracking |
| Residual fuel, waxes | 20 to 35 | Liquid or solid | 350 to 500 °C | Bunker fuel, candles, paraffin wax |
| Bitumen, coke | 35 and up | Solid | 500 °C and up | Road paving, roofing, steelmaking fuel |
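As a rough illustration of how Table 4-1 can be used as a lookup, the sketch below returns every fraction whose carbon range includes a given carbon count. The names `FRACTIONS` and `fractions_for` are hypothetical, and the ranges are simply copied from the table. Because adjacent ranges overlap, some carbon counts legitimately fall into two fractions.

```python
# Carbon-count ranges copied from Table 4-1; None means "no upper limit".
FRACTIONS = [
    ("Petroleum gases", 1, 4),
    ("Light ends", 5, 11),
    ("Middle distillates", 11, 18),
    ("Heavy gas oil, lube base", 18, 25),
    ("Residual fuel, waxes", 20, 35),
    ("Bitumen, coke", 35, None),
]


def fractions_for(carbons: int) -> list[str]:
    """Return every Table 4-1 fraction whose carbon range includes `carbons`."""
    return [
        name
        for name, low, high in FRACTIONS
        if carbons >= low and (high is None or carbons <= high)
    ]


print(fractions_for(8))    # ['Light ends']
print(fractions_for(22))   # ['Heavy gas oil, lube base', 'Residual fuel, waxes']
```

The overlaps reflect the fact that real distillation cuts are not sharp: a 22-carbon molecule can end up in either of two neighbouring streams.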
The Four Molecular Structures: PONA
Despite thousands of distinct hydrocarbons in crude, each belongs to one of only four structural families. Traders characterise a crude by its PONA ratio: Paraffinic, Olefinic, Naphthenic, Aromatic. A refinery that knows the PONA mix of its feedstock has a good idea of what its product slate will look like before a single molecule touches a distillation tower.
Paraffins (alkanes)
Paraffins follow the general formula CnH2n+2. All bonds are single, so paraffins are saturated and chemically stable. They come in two shapes: straight-chain (called “normal”, written with an n- prefix) and branched (called isomers, written with an iso- or i- prefix). A normal and an iso molecule can share the same chemical formula yet behave very differently, because the arrangement of atoms changes their physical properties.
The canonical example is octane (C8H18). Normal-octane is a straight chain of 8 carbons and has a research octane number of roughly minus 20, below the zero point of the scale, meaning it knocks badly in a spark-ignition engine. Iso-octane (technically 2,2,4-trimethylpentane) is a branched isomer with the same formula, but it resists knock so well that it was chosen as the reference point at octane 100. Branched paraffins dominate high-octane gasoline pools for exactly this reason. Very long-chain paraffins form petroleum waxes at room temperature.
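The general formula CnH2n+2 is easy to put to work. The following is a minimal sketch with hypothetical function names, using standard atomic weights for carbon and hydrogen, that builds the formula and molar mass of any paraffin from its carbon count.

```python
CARBON_MASS = 12.011   # g/mol, standard atomic weight of carbon
HYDROGEN_MASS = 1.008  # g/mol, standard atomic weight of hydrogen


def paraffin_formula(carbons: int) -> str:
    """Molecular formula of the paraffin with the given carbon count (CnH2n+2)."""
    hydrogens = 2 * carbons + 2
    carbon_part = "C" if carbons == 1 else f"C{carbons}"
    return f"{carbon_part}H{hydrogens}"


def paraffin_molar_mass(carbons: int) -> float:
    """Molar mass in g/mol of the paraffin with the given carbon count."""
    return carbons * CARBON_MASS + (2 * carbons + 2) * HYDROGEN_MASS


print(paraffin_formula(1))                 # CH4
print(paraffin_formula(8))                 # C8H18
print(round(paraffin_molar_mass(8), 2))    # 114.23
```

Note what the sketch cannot do: n-octane and iso-octane give identical output, because the formula records only how many atoms are present, not how they are arranged.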




Naphthenes (cycloalkanes)
Naphthenes are saturated rings of carbon. A single-ring naphthene follows the formula CnH2n. The workhorse example is cyclohexane, a six-carbon ring that is abundant in most crudes and an important precursor to nylon. Because naphthenes are saturated, they behave much like paraffins: chemically stable, good blending components, and easy for refinery units to handle. Paraffinic and naphthenic crudes are often lumped together as “paraffinic” by traders, against the more reactive aromatic crudes.

Aromatics
Aromatic hydrocarbons are ring structures built around at least one benzene ring: six carbons with three alternating double bonds. Because of those double bonds, aromatics are unsaturated and more reactive than paraffins or naphthenes. That reactivity is exactly what makes them valuable as petrochemical feedstocks; it is also why benzene, a known carcinogen, is tightly capped in gasoline. The key aromatics are benzene, toluene, and the three xylene isomers, grouped together as BTX.
Molecules with two or more fused benzene rings are called Polycyclic Aromatic Hydrocarbons (PAH). Naphthalene, with two fused rings, is the simplest. At the extreme heavy end sit asphaltenes: very large PAH molecules, often with more than 70 carbon atoms, heavy branches, and heteroatoms. Asphaltenes absorb visible light, which is why crude oil is black, and they dominate the residue fraction of heavy sour crudes. They can also clog pipelines in cold weather and produce undesirable shot coke in a coker.


Olefins (alkenes)
Olefins are aliphatic hydrocarbons with at least one carbon-carbon double bond: mono-olefins have one, and diolefins (dienes) have two. Alkynes, which contain a triple bond, are usually grouped with them as unsaturated aliphatics. Olefins are rarely present in crude oil straight out of the reservoir because they are too reactive to survive geological time. Almost every olefin in the oil system was manufactured at a refinery, typically by thermal or catalytic cracking of paraffins. Ethylene and propylene, the two most important olefins, are the foundation of the global petrochemical and plastics industry.


Typical PONA by Crude Grade
PONA mix shifts systematically with crude quality. A light sweet grade like WTI or Bonny Light is paraffin-rich, which is why it yields so much gasoline-range material with relatively little cracking. A medium sour grade like Arab Light sits in the middle of the spectrum. A heavy sour grade like Maya or Arab Heavy is aromatic-rich and resin-laden, with a long tail of asphaltenes and very little natural naphtha. The table below shows indicative PONA-plus-asphaltene ranges by crude class.
Table 4-2: Indicative PONA composition by crude grade (percent by weight of whole crude)
| Structure | Light sweet | Medium | Heavy sour |
|---|---|---|---|
| Paraffins | 45 to 60 | 30 to 45 | 15 to 25 |
| Naphthenes | 25 to 35 | 25 to 35 | 20 to 30 |
| Aromatics | 10 to 20 | 20 to 30 | 30 to 45 |
| Resins and asphaltenes | Below 2 | 3 to 8 | 10 to 20 |
| Olefins (in reservoir) | Trace | Trace | Trace |
Ranges vary by field and by how an assay laboratory draws the boundary between heavy aromatics and resins, so treat these as orders of magnitude rather than hard numbers. The shape of the story, however, is robust: paraffin fraction falls and aromatic plus asphaltene fraction rises as you move from light sweet to heavy sour.
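One way to see that trend is to reduce each range in Table 4-2 to its midpoint. The sketch below does that under stated assumptions: the dictionary and function names are hypothetical, "Below 2" is stored as the range 0 to 2, and midpoints are only a crude summary of indicative ranges, not assay data.

```python
# (low, high) weight-percent ranges copied from Table 4-2.
# "Below 2" is stored as (0, 2); that lower bound is an assumption of this sketch.
PONA_RANGES = {
    "Light sweet": {"paraffins": (45, 60), "aromatics": (10, 20), "resins_asphaltenes": (0, 2)},
    "Medium": {"paraffins": (30, 45), "aromatics": (20, 30), "resins_asphaltenes": (3, 8)},
    "Heavy sour": {"paraffins": (15, 25), "aromatics": (30, 45), "resins_asphaltenes": (10, 20)},
}


def midpoint(bounds):
    """Midpoint of a (low, high) range."""
    low, high = bounds
    return (low + high) / 2


def trend_summary(grade):
    """Return (paraffin midpoint, aromatic plus resin/asphaltene midpoint) for a grade."""
    comp = PONA_RANGES[grade]
    heavy_end = midpoint(comp["aromatics"]) + midpoint(comp["resins_asphaltenes"])
    return midpoint(comp["paraffins"]), heavy_end


for grade in PONA_RANGES:
    paraffins, heavy_end = trend_summary(grade)
    print(f"{grade}: paraffins ~{paraffins}, aromatics plus resins/asphaltenes ~{heavy_end}")
```

On these midpoints the paraffin share falls from about 52 to 20 percent while the aromatic plus resin and asphaltene share rises from about 16 to 52 percent.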
Saturated vs Unsaturated
A molecule is saturated when every carbon is bonded to four other atoms through single bonds, and unsaturated when one or more double or triple bonds are present. Paraffins and naphthenes are saturated and therefore stable; aromatics and olefins are unsaturated and therefore reactive. Fuels prize stability (you do not want gasoline polymerising in the tank), while petrochemical feedstocks prize reactivity (that is how you build polymers in the first place). The entire economic logic of a refinery, splitting molecules into a stable fuel pool and a reactive chemical pool, follows from this single distinction.
Heteroatoms: Sulfur, Nitrogen, Oxygen, Metals
Not every atom in crude is carbon or hydrogen. The other elements, collectively called heteroatoms, are usually impurities but they drive a surprising amount of refinery economics.
Sulfur appears as hydrogen sulfide (H2S), as mercaptans (thiols), and bound inside thiophene rings. H2S is acutely toxic, corrodes steel, and carries the rotten-egg odour associated with sour crude; mercaptans are similarly foul-smelling, which is why they are used as the odorant added to household natural gas. Thiophenes are the hardest to remove because the sulfur is locked inside an aromatic ring. Nitrogen shows up in pyridines and pyrroles and poisons the catalysts used in cracking and reforming, which is why high-nitrogen crudes carry a discount. Oxygen appears mainly as naphthenic acids (measured by Total Acid Number, TAN) and phenols; high-TAN crudes can corrode carbon-steel piping above 220 °C. Metals, chiefly vanadium and nickel, sit at the centre of porphyrin rings and end up in the residue. Even trace amounts poison FCC catalysts, so heavy sour crudes with high metals are restricted to refineries with cokers that can reject metals into petroleum coke.
Table 4-3: Typical heteroatom ranges in whole crude
| Element | Common form | Typical range | Why it matters |
|---|---|---|---|
| Sulfur | H2S, mercaptans, thiophenes | 0.05 to 5 percent by weight | SOx emissions, corrosion, catalyst poisoning |
| Nitrogen | Pyridines, pyrroles | 0.05 to 0.8 percent by weight | NOx precursor, poisons cracking catalysts |
| Oxygen | Naphthenic acids, phenols | 0.05 to 1.5 percent by weight | Corrosion above 220 °C (high-TAN crudes) |
| Vanadium | Vanadyl porphyrins | 1 to 1,200 ppm | Poisons FCC catalyst, ends up in coke |
| Nickel | Nickel porphyrins | 1 to 150 ppm | Same as vanadium, less severe |
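Table 4-3 mixes two units, percent by weight and parts per million (ppm) by weight. A small conversion sketch makes them comparable; the function names are hypothetical, and the only facts used are that 1 percent equals 10,000 ppm and that one tonne is 1,000 kg.

```python
PPM_PER_PERCENT = 10_000  # 1 percent by weight = 10,000 ppm by weight


def ppm_to_percent(ppm: float) -> float:
    """Convert ppm by weight to percent by weight."""
    return ppm / PPM_PER_PERCENT


def kg_per_tonne_from_percent(percent: float) -> float:
    """Kilograms of an element in one tonne of crude, given weight percent."""
    return percent * 10      # 1 percent of 1,000 kg is 10 kg


def kg_per_tonne_from_ppm(ppm: float) -> float:
    """Kilograms of an element in one tonne of crude, given ppm by weight."""
    return ppm / 1000        # 1 ppm of 1,000 kg is 1 gram


print(ppm_to_percent(1200))            # top of the vanadium range as a percentage
print(kg_per_tonne_from_percent(5))    # top of the sulfur range, kg per tonne
print(kg_per_tonne_from_ppm(1200))     # top of the vanadium range, kg per tonne
```

At the top of the table's ranges, a tonne of crude carries about 50 kg of sulfur but only about 1.2 kg of vanadium, which shows how little metal is needed to matter to a catalyst.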
Combustion Chemistry
The reason hydrocarbons matter is that their oxidation reaction is strongly exothermic. For methane, the simplest case:
CH4 + 2 O2 → CO2 + 2 H2O + heat
For iso-octane, the reference fuel for gasoline, the balanced reaction is:
C8H18 + 12.5 O2 → 8 CO2 + 9 H2O + heat
From that equation you can derive the stoichiometric air-fuel ratio: the mass of air that exactly consumes a unit mass of fuel with nothing left over. For pure iso-octane that ratio works out to roughly 15.1 to 1 by weight; for typical pump gasoline it is about 14.7 to 1, the famous number engine control units target in closed-loop operation. Real combustion is messier. If the reaction runs cool, or oxygen is short, some carbon exits as carbon monoxide (CO), unburnt hydrocarbons, or soot. Air is mostly nitrogen, so at the peak flame temperatures inside an engine cylinder some of that nitrogen oxidises to NOx. Any sulfur in the fuel oxidises to SOx. NOx and SOx are the two pollutants that drive most downstream environmental regulation. Chapter 15 (Environmental) picks up their story.
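The stoichiometric ratio can be worked out directly from the balanced equations. The sketch below is a simplified calculation with hypothetical names: it assumes a pure hydrocarbon CxHy, complete combustion needing x + y/4 moles of O2 per mole of fuel, and air that is about 23.2 percent oxygen by mass. For pure iso-octane it gives a ratio close to 15.1 to 1; the widely quoted 14.7 to 1 corresponds to a typical gasoline blend rather than to pure iso-octane.

```python
CARBON_MASS = 12.011             # g/mol
HYDROGEN_MASS = 1.008            # g/mol
O2_MASS = 31.998                 # g/mol
O2_MASS_FRACTION_IN_AIR = 0.232  # assumed mass fraction of oxygen in dry air


def stoichiometric_afr(carbons: int, hydrogens: int) -> float:
    """Mass of air per unit mass of fuel for complete combustion of CxHy."""
    fuel_mass = carbons * CARBON_MASS + hydrogens * HYDROGEN_MASS
    o2_moles = carbons + hydrogens / 4   # CxHy + (x + y/4) O2 -> x CO2 + (y/2) H2O
    air_mass = o2_moles * O2_MASS / O2_MASS_FRACTION_IN_AIR
    return air_mass / fuel_mass


print(f"methane:    {stoichiometric_afr(1, 4):.1f} to 1")    # about 17.2
print(f"iso-octane: {stoichiometric_afr(8, 18):.1f} to 1")   # about 15.1
```

Methane needs more air per kilogram than iso-octane because it carries more hydrogen per unit of mass, and hydrogen consumes more oxygen per kilogram than carbon does.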
Octane and Cetane: Measuring Fuel Quality
Spark-ignition engines want a fuel that resists auto-ignition (knock). The octane scale was built by defining n-heptane as 0 and iso-octane as 100, then rating any other fuel against a matching blend of the two. Some molecules beat iso-octane outright and score above 100. Compression-ignition (diesel) engines want the opposite: a fuel that auto-ignites quickly under compression, which is measured by the cetane number. Here n-cetane (n-hexadecane) is defined as 100 and alpha-methylnaphthalene as 0.
Table 4-4: Indicative octane and cetane numbers for reference molecules
| Molecule | Research octane | Cetane number |
|---|---|---|
| n-Heptane | 0 (reference) | 56 |
| Iso-octane (2,2,4-trimethylpentane) | 100 (reference) | 15 |
| Toluene | 111 | Low (not a diesel fuel) |
| Benzene | 100 | Low |
| Methanol | 107 | Very low |
| Ethanol | 108 | Very low |
| n-Cetane (n-hexadecane) | Very low | 100 (reference) |
| alpha-Methylnaphthalene | High | 0 (reference) |
| Typical finished gasoline | 87 to 95 | Not applicable |
| Typical finished diesel | Not applicable | 40 to 55 |
Figure 4-12: Research Octane Number (RON) of Selected Fuels
The octane scale is defined by two reference fuels: n-heptane (RON 0) and iso-octane (RON 100). US pump ratings use the Anti-Knock Index (AKI), which averages RON and MON and is typically 4 to 6 points below RON. Source: ASTM D2699.
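The Anti-Knock Index is a plain average, which a few lines of code make explicit. The function name and the sample ratings below are illustrative assumptions, not measured values for any particular fuel.

```python
def anti_knock_index(ron: float, mon: float) -> float:
    """Anti-Knock Index: the average of research and motor octane numbers."""
    return (ron + mon) / 2


# Hypothetical gasoline with RON 95 and MON 85.
ron, mon = 95.0, 85.0
aki = anti_knock_index(ron, mon)
print(aki)         # 90.0
print(ron - aki)   # 5.0 points below RON
```

This is why a gasoline labelled 95 in a RON market and one labelled around 90 or 91 at a US pump can be much the same fuel.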
Notice the pattern: aromatics (toluene, benzene) and branched paraffins (iso-octane) score well on octane but poorly on cetane. Straight-chain paraffins (n-heptane, n-cetane) do the opposite. A molecule that is good for a gasoline engine is almost by definition bad for a diesel engine, which is why refineries cannot simply pour one finished stream into both pools.
Cracking and Combining
Basic distillation leaves a refinery at the mercy of whatever carbon-count distribution nature handed it. The real power of modern refining lies in using heat, pressure, and catalysts to rearrange molecules. Thermal cracking heats long paraffins until carbon-carbon bonds snap, producing shorter paraffins plus olefins. In shorthand:
C16H34 → C8H18 + C8H16
A sixteen-carbon wax molecule becomes an octane paraffin plus an octene olefin. Catalytic cracking (the FCC unit) does something similar at lower temperatures using a zeolite catalyst that favours branched, higher-octane products. Alkylation runs the process in reverse: isobutane plus a small olefin combine across an acid catalyst to yield a branched octane-range paraffin (alkylate) that is prized in gasoline blending.
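A quick way to sanity-check shorthand reactions like these is to count atoms on each side. The sketch below is a minimal checker with hypothetical function names; it handles only simple formulas such as C16H34 (no brackets or charges). The alkylation example uses butene (C4H8) as the small olefin, which is an assumption chosen for illustration.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'C8H18' (no brackets)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def is_balanced(reactants: list[str], products: list[str]) -> bool:
    """True when both sides of a reaction contain the same atoms."""
    left = sum((parse_formula(f) for f in reactants), Counter())
    right = sum((parse_formula(f) for f in products), Counter())
    return left == right


# Cracking: one long paraffin splits into a paraffin plus an olefin.
print(is_balanced(["C16H34"], ["C8H18", "C8H16"]))   # True

# Alkylation: isobutane plus butene (assumed olefin) gives an octane-range paraffin.
print(is_balanced(["C4H10", "C4H8"], ["C8H18"]))     # True
```

The cracking check also shows why an olefin must appear: splitting C16H34 into two C8 paraffins would need 36 hydrogens, two more than the feed molecule has.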
A refinery can lift gasoline yield from roughly 20 percent of a barrel under basic distillation to 55 percent or more by cracking and combining hydrocarbons. This flexibility is what makes a complex refinery so much more valuable than a simple one.
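To put those percentages in volume terms, the sketch below converts a gasoline yield into US gallons per barrel, using the standard 42 US gallons per barrel. The function name is hypothetical, and the calculation treats yield as a simple volume fraction, ignoring processing gain and losses.

```python
GALLONS_PER_BARREL = 42  # US gallons in one standard oil barrel


def gasoline_gallons(yield_fraction: float, barrels: float = 1.0) -> float:
    """US gallons of gasoline obtained from crude at a given yield fraction."""
    if not 0 <= yield_fraction <= 1:
        raise ValueError("yield_fraction must be between 0 and 1")
    return yield_fraction * barrels * GALLONS_PER_BARREL


print(f"{gasoline_gallons(0.20):.1f} gallons per barrel")   # basic distillation
print(f"{gasoline_gallons(0.55):.1f} gallons per barrel")   # with cracking and combining
```

On these figures the same barrel goes from about 8.4 gallons of gasoline to about 23.1 gallons.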
Why Chemistry Drives Pricing
Every concept in this chapter comes back to one point. Crude grades trade at different prices because their molecular mixes yield different product slates. A light sweet crude is paraffin-rich and low in heteroatoms, so a simple refinery can turn it into high-value gasoline and distillate with minimal treatment. A heavy sour crude is aromatic- and asphaltene-rich with meaningful sulfur, nitrogen, and metals, so it needs a coker, a hydrocracker, and extensive hydrotreating before the same products can be made. Those extra units cost capital and hydrogen, and that cost is exactly what shows up in the light-heavy and sweet-sour price differentials discussed in Chapter 17 (Oil Prices). PONA, heteroatoms, and bond chemistry are the foundation for every price spread in the oil complex.
Chapter 7 (Refining) picks up the process units (FCC, hydrocracker, coker, reformer, alkylation) that turn PONA theory into gasoline and diesel. Chapter 10 (Petrochemicals) returns to olefins, aromatics, and BTX as petrochemical feedstocks. Chapter 15 (Environmental) follows NOx, SOx, and CO2 from combustion through to regulation.
The above was updated in 2026. For the full original 2009 chapter, download the 1st edition 2009 PDF.