Introduction to proteins

Proteins are large linear polymers made up of hundreds or even thousands of L-α-amino acids linked in series by covalent bonds called peptide bonds.
They are a large and diverse class of biomolecules: they account for more than 50% of the dry weight of cells, where they are the most abundant macromolecules, with ∼42 million protein molecules and thousands of different proteins per cell.
Because proteins are generally easier to isolate and characterize than lipids, nucleic acids, and polysaccharides (complex carbohydrates), early biochemical research focused on them; it can be traced back to studies on the chemical composition of albumins conducted by Jöns Jacob Berzelius and Gerardus Johannes Mulder in 1839. By contrast, the role of nucleic acids in the transmission and expression of genetic information came to light in the 1940s, and their catalytic role only in the 1980s, whereas the role of lipids in biological membranes was established in the 1960s.
[Figure: Example of the three-dimensional structure of proteins]
Proteins are very versatile macromolecules and play a central role in essentially all cell structures and functions, such as oxygen transport, immune response, growth and differentiation control, or movement. Each protein is tailored to its biological role, and the instructions for its synthesis, and hence for its role, are stored in the gene that codes for the protein. Hence, proteins are the vehicle through which genetic information is expressed.

From amino acids to primary structure

Like other biological macromolecules, such as nucleic acids and polysaccharides, proteins are built of many small organic molecules or monomer units, namely, L-α-amino acids.
L-α-Amino acids are bifunctional organic compounds containing both a carboxyl group and an amino group attached to a central carbon atom, also known as the α-carbon. Each amino acid is characterized by a side chain, known as an R group, that, like the carboxyl group and the amino group, is attached to the α-carbon. R groups determine the properties of the amino acid and vary in size, charge, shape, and reactivity. For example, the R groups of the different amino acids include thiols, alcohols, carboxylic acids, different basic groups, and carboxamides. From viruses to humans, about 20 different L-α-amino acids are incorporated into proteins, although occasionally D-α-amino acids are incorporated, too.
By joining amino acids together in a characteristic linear sequence, cells are able to produce proteins with strikingly different properties and activities, from enzymes to the lens protein of the eye, or mushroom poisons. All these properties and activities derive from the amino acid sequence of the polypeptide chain, also called the primary structure, which is unique and genetically determined.
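To give a sense of how much variety a linear sequence allows, the following minimal Python sketch counts the sequences that can be written with 20 amino acids at each position. The one-letter alphabet and the function name possible_sequences are illustrative assumptions, not part of the original text, and the count is purely combinatorial: it says nothing about which sequences actually occur or fold in cells.

```python
# One-letter codes for the 20 standard amino acids (illustrative alphabet).
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def possible_sequences(length, alphabet_size=len(STANDARD_AMINO_ACIDS)):
    """Number of distinct linear sequences of the given length.

    Each position can hold any of `alphabet_size` residues independently,
    so the count is alphabet_size ** length.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


if __name__ == "__main__":
    for n in (2, 3, 10, 100):
        count = possible_sequences(n)
        # Report the number of decimal digits, since the counts grow very fast.
        print(f"length {n}: about 10^{len(str(count)) - 1} possible sequences")
```

Even a short chain of 100 residues, small by protein standards, has far more possible sequences than any cell could ever produce.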
During protein synthesis, the carboxyl group of one amino acid is covalently linked to the amino group of the incoming amino acid via condensation, a reaction catalyzed by specific enzymes during which a molecule of water is released, to form a peptide bond. A linear polypeptide chain is formed by end-to-end joining of many amino acids.
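The bookkeeping of the condensation reaction can be sketched in a few lines of Python: a linear chain of n amino acids contains n - 1 peptide bonds, and each bond formed releases one molecule of water. The mass table below lists approximate average molecular masses for only a handful of free amino acids, and the names FREE_AMINO_ACID_MASS, peptide_bond_count, and peptide_mass are assumptions made for this illustration.

```python
# Approximate average molecular masses (g/mol) of a few free amino acids.
# Illustrative subset only; values are rounded.
FREE_AMINO_ACID_MASS = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "S": 105.09,  # serine
    "V": 117.15,  # valine
    "L": 131.17,  # leucine
}

WATER_MASS = 18.02  # g/mol, released once per peptide bond formed


def peptide_bond_count(sequence):
    """A linear chain of n residues is held together by n - 1 peptide bonds."""
    return max(len(sequence) - 1, 0)


def peptide_mass(sequence):
    """Approximate mass of a linear peptide.

    Sum the masses of the free amino acids, then subtract one water
    molecule for every peptide bond formed by condensation.
    """
    total = sum(FREE_AMINO_ACID_MASS[residue] for residue in sequence)
    return total - peptide_bond_count(sequence) * WATER_MASS


if __name__ == "__main__":
    tripeptide = "GAS"  # glycine-alanine-serine
    print(peptide_bond_count(tripeptide), "peptide bonds")
    print(round(peptide_mass(tripeptide), 2), "g/mol (approximate)")
```

For the tripeptide Gly-Ala-Ser, two peptide bonds are formed, so two water molecules are subtracted from the summed masses of the three free amino acids.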

Connection between structure and function of proteins

The properties and functions of proteins are largely determined by their three-dimensional structure, or conformation.
The polypeptide chain is not a stiff structure: it spontaneously folds into a distinct three-dimensional structure that is largely determined by its amino acid sequence.
The folding of the primary structure into specific three-dimensional structures, such as secondary structures, supersecondary structures, domains, and, finally, tertiary structures, marks the transition from the one-dimensional world of the polypeptide chain to the three-dimensional world of proteins. The tertiary structure of a protein is called the native structure and is its biologically active form.
In addition, two or more polypeptide chains can associate to form a higher structure called quaternary structure, characteristic for example of hemoglobin.
Finally, proteins can interact to form “macromolecular machines” capable of carrying out functions that the individual proteins would not be able to accomplish alone. Examples are multienzyme complexes, such as the pyruvate dehydrogenase complex.
One method of protein classification is based on biological function.
Two other classification methods are based on shape and chemical composition.

On the basis of the shape, proteins may be divided into:

  • fibrous proteins, such as collagen or elastin;
  • globular proteins, such as enzymes, or membrane transporters and receptors.

On the basis of chemical composition, they may be divided into:

  • simple proteins or homoproteins, made up of only amino acids, such as collagen or plasma albumin;
  • conjugated proteins, or heteroproteins, that contain a non-protein portion, such as glycoproteins like glycophorin, a membrane protein of red blood cells.
