Flavonoid biosynthesis pathway: genes and enzymes

The biosynthesis of flavonoids, probably the best characterized pathway of plant secondary metabolism, is part of the phenylpropanoid pathway that, in addition to flavonoids, leads to the formation of a wide range of phenolic compounds, such as hydroxycinnamic acids, stilbenes, lignans and lignins.
Flavonoid biosynthesis is linked to primary metabolism through both mitochondria- and plastid-derived molecules. Since it seems that most of the involved enzymes characterized to date operate in protein complexes located in the cell cytosol, these molecules must be exported to the cytoplasm to be used.
The end products are transported to different intracellular or extracellular locations, with flavonoids involved in pigmentation usually transported into the vacuoles.
The biosynthesis of this group of polyphenols requires one p-coumaroyl-CoA and three malonyl-CoA molecules as initial substrates.

Figure: Flavonoid biosynthesis pathway


Biosynthesis of p-coumaroyl-CoA

p-Coumaroyl-CoA is the pivotal branch-point metabolite in the phenylpropanoid pathway, being the precursor of a wide variety of phenolic compounds, both flavonoid and non-flavonoid polyphenols.
It is produced from phenylalanine via three reactions catalyzed by cytosolic enzymes collectively called group I or early-acting enzymes, in order of action:

  • phenylalanine ammonia lyase (EC …);
  • trans-cinnamate 4-monooxygenase (EC …);
  • 4-coumarate-CoA ligase (EC …).
Figure: Biosynthesis of p-coumaroyl-CoA from phenylalanine

They seem to be associated in a multienzyme complex anchored to the endoplasmic reticulum membrane. The anchoring is probably ensured by cinnamate 4-hydroxylase (trans-cinnamate 4-monooxygenase), which inserts its N-terminal domain into the membrane of the endoplasmic reticulum. These complexes, referred to as “metabolons”, allow the product of a reaction to be channeled directly as substrate to the active site of the enzyme that catalyzes the next reaction in the metabolic pathway.
With the exception of cinnamate 4-hydroxylase, the enzymes which act downstream of phenylalanine ammonia lyase are encoded by small gene families in all species analyzed so far.
The different isoenzymes show distinct temporal, tissue, and elicitor-induced patterns of expression. It seems, in fact, that each member of each family can be used mainly for the synthesis of a specific compound, thus acting as a control point for carbon flux among the metabolic pathways leading to lignan, lignin, and flavonoid biosynthesis.
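
The order of the three early-acting reactions described above can be sketched as a simple lookup chain. This is only an illustrative Python sketch: the `EARLY_STEPS` table and the `trace_pathway` helper are hypothetical names introduced here, not part of any established tool, and the enzyme and metabolite names are the ones used in this article.

```python
# Each entry maps a substrate to (enzyme, product), in the order given in the text.
# EARLY_STEPS and trace_pathway are hypothetical names used only for illustration.
EARLY_STEPS = {
    "phenylalanine": ("phenylalanine ammonia lyase", "trans-cinnamic acid"),
    "trans-cinnamic acid": ("trans-cinnamate 4-monooxygenase", "p-coumaric acid"),
    "p-coumaric acid": ("4-coumarate-CoA ligase", "p-coumaroyl-CoA"),
}


def trace_pathway(start, steps=EARLY_STEPS):
    """Follow the chain from `start` until no further step is defined."""
    route = []
    metabolite = start
    while metabolite in steps:
        enzyme, product = steps[metabolite]
        route.append((metabolite, enzyme, product))
        metabolite = product
    return route


if __name__ == "__main__":
    for substrate, enzyme, product in trace_pathway("phenylalanine"):
        print(f"{substrate} --[{enzyme}]--> {product}")
```

Starting the trace at an intermediate, for example `trace_pathway("trans-cinnamic acid")`, returns only the remaining steps, which mirrors the idea that each product is handed on as the substrate of the next enzyme.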

Note: Phenylalanine is a product of the shikimic acid pathway, which converts simple precursors derived from carbohydrate metabolism, phosphoenolpyruvate and erythrose-4-phosphate, into the aromatic amino acids phenylalanine, tyrosine and tryptophan. Unlike plants and microorganisms, animals do not possess the shikimic acid pathway, and are not able to synthesize the three above-mentioned amino acids, which are therefore essential nutrients.

Phenylalanine ammonia lyase (PAL)

It is one of the most studied and best characterized enzymes of plant secondary metabolism. It requires no cofactors and catalyzes the reaction that links primary and secondary metabolism: the reversible deamination of phenylalanine to trans-cinnamic acid, with the release of nitrogen as ammonia and introduction of a trans double bond between carbon atoms 7 and 8 of the side chain.

Phenylalanine ⇄ trans-Cinnamic Acid + NH3

Therefore, it directs the flow of carbon from the shikimic acid pathway to the different branches of the phenylpropanoid pathway. The released ammonia is probably fixed in the reaction catalyzed by glutamine synthetase.
The enzyme from monocots is also able to act as tyrosine ammonia lyase (EC …), converting tyrosine directly to p-coumaric acid (therefore without the 4-hydroxylation step), but with lower efficiency.
In all plant species investigated, several copies of the phenylalanine ammonia lyase gene are found, and these copies probably respond differentially to internal and external stimuli. Indeed, gene transcription, and hence enzyme activity, are under the control of both internal developmental and external environmental stimuli. Here are some examples of situations that require increased enzyme activity.

  • Flowering.
  • The production of lignin to strengthen the secondary wall of xylem cells.
  • The production of flower pigments that attract pollinators.
  • Pathogen infections, which require the production of phenylpropanoid phytoalexins, or exposure to UV rays.

trans-Cinnamate 4-monooxygenase

It belongs to the cytochrome P450 superfamily (EC 1.14.-.-) and is a microsomal monooxygenase that contains a heme cofactor and depends on both O2 and NADPH. It catalyzes the formation of p-coumaric acid through the introduction of a hydroxyl group at position 4 of trans-cinnamic acid (this hydroxyl group is present in most flavonoids).

trans-Cinnamic Acid + NADPH + H+ + O2 ⇄ p-Coumaric Acid + NADP+ + H2O

This reaction is also part of the biosynthesis of hydroxycinnamic acids.
Increases in transcription rates and enzyme activity are observed in correlation with the synthesis of phytoalexins (in response to fungal infections), lignification as well as wounding.

4-Coumarate:CoA ligase (4CL)

With Mg2+ as a cofactor, it catalyzes the ATP-dependent activation of the carboxyl group of p-coumaric acid and other hydroxycinnamic acids, metabolically rather inert molecules, through the formation of the corresponding CoA-thioester.

p-Coumaric Acid + ATP + CoA ⇄ p-Coumaroyl-CoA + AMP + PPi

Generally, p-coumaric acid and caffeic acid are the preferred substrates, followed by ferulic acid and 5-hydroxyferulic acid; activity toward trans-cinnamic acid is low, and there is none toward sinapic acid. These CoA-thioesters are able to enter various reactions such as:

  • reduction to aldehydes or alcohols (monolignols);
  • stilbene and flavonoid biosynthesis;
  • transfer to acceptor molecules.

Finally, it should be pointed out that the carboxyl group can also be activated through a UDP-glucose-dependent transfer to glucose.

Biosynthesis of malonyl-CoA

Malonyl-CoA does not derive from the phenylpropanoid pathway, but from the reaction catalyzed by acetyl-CoA carboxylase (EC …; the cytosolic form, see below). The enzyme, with biotin and Mg2+ as cofactors, catalyzes the ATP-dependent carboxylation of acetyl-CoA, using bicarbonate as a source of carbon dioxide (CO2).

Acetyl-CoA + HCO3− + ATP → Malonyl-CoA + ADP + Pi

The enzyme is found both in the plastids, where it participates in the synthesis of fatty acids, and in the cytoplasm; it is the cytosolic form that catalyzes the formation of the malonyl-CoA used in the biosynthesis of flavonoids and other compounds. Increases in the transcription rate of the gene and in enzyme activity are induced in response to stimuli that increase the biosynthesis of these polyphenols, such as exposure to pathogenic fungi or UV rays.
In turn, acetyl-CoA is produced in plastids, mitochondria, peroxisomes and cytosol through different metabolic pathways. The molecules used in the biosynthesis of malonyl-CoA, and therefore of the flavonoids, are the cytosolic ones, produced in the reaction catalyzed by ATP-citrate lyase (EC …), which cleaves citrate, in the presence of CoA and ATP, to form oxaloacetate and acetyl-CoA, plus ADP and inorganic phosphate.
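
The two ATP-dependent reactions described in this section can be tallied with a few lines of Python. This is a rough sketch under stated assumptions: it counts only the one ATP used by ATP-citrate lyase and the one ATP used by acetyl-CoA carboxylase per malonyl-CoA, as written in the reactions above, and it ignores any other cost (for example transport between compartments). The function name `atp_for_malonyl_coa` is hypothetical, and the default of three units reflects the three malonyl-CoA molecules required per flavonoid, as stated at the start of the article.

```python
# ATP consumed per molecule formed, as stated in the two reactions in the text.
ATP_PER_CITRATE_CLEAVAGE = 1  # ATP-citrate lyase: citrate -> acetyl-CoA
ATP_PER_CARBOXYLATION = 1     # acetyl-CoA carboxylase: acetyl-CoA -> malonyl-CoA


def atp_for_malonyl_coa(n_units=3):
    """Return the ATP used by each enzyme, and in total, to make n malonyl-CoA.

    Hypothetical helper: only the two reactions named in the text are counted.
    """
    lyase = n_units * ATP_PER_CITRATE_CLEAVAGE
    carboxylase = n_units * ATP_PER_CARBOXYLATION
    return {
        "ATP-citrate lyase": lyase,
        "acetyl-CoA carboxylase": carboxylase,
        "total": lyase + carboxylase,
    }


if __name__ == "__main__":
    print(atp_for_malonyl_coa())
```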

First steps in flavonoid biosynthesis

The first step in flavonoid biosynthesis is catalyzed by chalcone synthase (EC …), an enzyme anchored to the endoplasmic reticulum and with no known cofactors.
From one p-coumaroyl-CoA and three malonyl-CoA, it catalyzes sequential condensation and decarboxylation reactions in the course of which a polyketide intermediate is formed. The polyketide undergoes cyclizations and aromatizations leading to the formation of the A ring. The product of the reactions is naringenin chalcone (2′,4,4′,6′-tetrahydroxychalcone), a 6′-hydroxychalcone and the first flavonoid to be synthesized by plants.

p-Coumaroyl-CoA + 3 Malonyl-CoA → Naringenin Chalcone + 4 CoA + 3 CO2

The reaction, which is cytosolic, is irreversible due to the release of three CO2 and four CoA molecules.
The B ring and the three-carbon bridge of the molecule originate from p-coumaroyl-CoA (and therefore from phenylalanine), the A ring from the three malonyl-CoA units.
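
The origin of the C6-C3-C6 skeleton can be checked with a simple carbon count. The sketch below is illustrative only: it counts the carbons of the acyl moieties (CoA itself is excluded, since it is released unchanged), and the constant and function names are hypothetical. The figures used are 9 carbons for the p-coumaroyl unit and 3 carbons for each malonyl unit, one of which is lost as CO2.

```python
# Carbon atoms in the acyl moieties only; CoA is released and not counted.
P_COUMAROYL_CARBONS = 9  # C6 aromatic ring + C3 side chain
MALONYL_CARBONS = 3      # per malonyl unit
CO2_CARBONS = 1          # one CO2 is released per malonyl unit condensed


def chalcone_carbons(n_malonyl=3):
    """Carbons left in the chalcone after condensation and decarboxylation."""
    donated = n_malonyl * MALONYL_CARBONS
    lost = n_malonyl * CO2_CARBONS
    return P_COUMAROYL_CARBONS + donated - lost


def a_ring_carbons(n_malonyl=3):
    """Carbons contributed to the A ring: two per malonyl unit after CO2 loss."""
    return n_malonyl * (MALONYL_CARBONS - CO2_CARBONS)


if __name__ == "__main__":
    print("total carbons:", chalcone_carbons())
    print("A ring carbons:", a_ring_carbons())
```

With the default of three malonyl units, the count gives 15 carbons in total, 6 of them in the A ring and 9 in the B ring plus the three-carbon bridge, which matches the C6-C3-C6 skeleton.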

Figure: The origin of the flavonoid skeleton

6′-Deoxychalcones can also be produced; their synthesis is thought to involve an additional reduction step catalyzed by polyketide reductase (EC 1.1.1.-).
Chalcone synthase from some plant species, such as barley (Hordeum vulgare), also accepts caffeoyl-CoA, feruloyl-CoA and cinnamoyl-CoA as substrates.
It is the most abundant enzyme of the phenylpropanoid pathway, probably because it has a low catalytic activity, and, in fact, is considered to be the rate-limiting enzyme in flavonoid biosynthesis.
As for phenylalanine ammonia lyase, chalcone synthase gene expression is under the control of both internal and external factors. In some plants, one or two isoenzymes are found, while in others up to nine.
Chalcone synthase belongs to the polyketide synthase group, present in bacteria, fungi and plants. These enzymes catalyze the production of polyketide chains through sequential condensations of acetate units provided by malonyl-CoA. The group also includes stilbene synthase (EC …), which catalyzes the formation of resveratrol, a non-flavonoid polyphenol that has attracted much interest because of its potential health benefits.
Generally, chalcones do not accumulate in plants because naringenin chalcone is converted to (2S)-naringenin, a flavanone, in the reaction catalyzed by chalcone isomerase (EC …).
The enzyme, the first of the flavonoid biosynthesis to be discovered, catalyzes a stereospecific isomerization and closes the C ring. Two types of chalcone isomerases are known, called type I and II. Type I enzymes use only 6′-hydroxychalcone substrates, like naringenin chalcone, while type II, prevalent in legumes, use both 6′-hydroxy- and 6′-deoxychalcone substrates.
It should be noted that with 6′-hydroxychalcones, isomerization can also occur nonenzymically to form a racemic mixture, both in vitro and in vivo, enough to allow a moderate synthesis of anthocyanins. On the contrary, under physiological conditions 6′-deoxychalcones are stable, and so the activity of type II chalcone isomerases is required to form flavanones.
The enzyme increases the rate of the reaction 10^7-fold compared to the spontaneous reaction, with faster kinetics for 6′-hydroxychalcones than for 6′-deoxychalcones. Finally, it produces (2S)-flavanones, which are the biosynthetically required enantiomers.
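
The substrate rules for the two chalcone isomerase types can be written down as a small lookup. This is a minimal sketch of the statements above, not a model of enzyme kinetics; the names `CHI_SUBSTRATES`, `isomerase_types_for` and `cyclizes_spontaneously` are hypothetical.

```python
HYDROXY = "6'-hydroxychalcone"
DEOXY = "6'-deoxychalcone"

# Substrate classes accepted by each chalcone isomerase type, as stated in the text.
CHI_SUBSTRATES = {
    "type I": {HYDROXY},
    "type II": {HYDROXY, DEOXY},
}


def isomerase_types_for(substrate):
    """List the chalcone isomerase types that accept the given substrate class."""
    return sorted(t for t, accepted in CHI_SUBSTRATES.items() if substrate in accepted)


def cyclizes_spontaneously(substrate):
    """Only 6'-hydroxychalcones isomerize nonenzymically (to a racemic mixture)."""
    return substrate == HYDROXY


if __name__ == "__main__":
    for s in (HYDROXY, DEOXY):
        print(s, isomerase_types_for(s), "spontaneous:", cyclizes_spontaneously(s))
```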
As with other enzymes of flavonoid biosynthesis, chalcone isomerase gene expression is subject to strict control and, as with phenylalanine ammonia lyase and chalcone synthase, it is induced by elicitors.
In the reaction catalyzed by flavanone-3β-hydroxylase (EC …), (2S)-flavanones undergo a stereospecific hydroxylation at position 3 that converts them into the respective (2R,3R)-dihydroflavonols. In particular, naringenin is converted into dihydrokaempferol.
The enzyme is a cytosolic non-heme dioxygenase, dependent on Fe2+ and 2-oxoglutarate, and therefore belongs to the family of 2-oxoglutarate-dependent dioxygenases (which distinguishes it from the other hydroxylases of the flavonoid biosynthetic pathway, which are cytochrome P450 enzymes).

Naringenin chalcone, (2S)-naringenin, and dihydrokaempferol are central intermediates in flavonoid biosynthesis, since they act as branch-point compounds from which, directly or indirectly, the synthesis of distinct flavonoid subclasses can occur.

Not all of these side metabolic pathways are present in every plant species, or are active within each tissue type of a given plant. Like the enzymes described above, those involved in these “side-routes” are subject to strict control, resulting in a tissue-specific profile of flavonoid compounds. For example, grape seeds, flesh and skin have distinct anthocyanin, catechin, flavonol and condensed tannin profiles, whose synthesis and accumulation are strictly and temporally coordinated during the ripening process.



Lignans: structure, metabolism, benefits, and foods

Lignans are a subgroup of non-flavonoid polyphenols.
They are widely distributed in the plant kingdom, being present in more than 55 plant families, where they act as antioxidants and defense molecules against pathogenic fungi and bacteria.
In humans, epidemiological and physiological studies have shown that they can exert positive effects in the prevention of lifestyle-related diseases, such as type II diabetes and cancer. For example, an increased dietary intake of these polyphenols correlates with a reduction in the occurrence of certain types of estrogen-related tumors, such as breast cancer in postmenopausal women.
In addition, some lignans have also aroused pharmacological interest. Examples are:

  • podophyllotoxin, obtained from plants of the genus Podophyllum (Berberidaceae family); it is a mitotic toxin whose derivatives have been used as chemotherapeutic agents;
  • arctigenin and trachelogenin, obtained from tropical climbing plants; they have antiviral properties and have been tested in the search for a drug to treat AIDS.


Chemical structure of lignans

Their basic chemical structure consists of two phenylpropane units linked by a C-C bond between the central atoms of the respective side chains (position 8 or β), also called a β-β′ bond. 3-3′, 8-O-4′, or 8-3′ bonds are observed less frequently; in these cases the dimers are called neolignans. Hence, their chemical structure is referred to as (C6-C3)2, and they are included in the phenylpropanoid group, as are their precursors, the hydroxycinnamic acids (see below).

Fig. 1 – Phenylpropanoid unit (skeletal formula)

Based on their carbon skeleton, cyclization pattern, and the way in which oxygen is incorporated in the molecule skeleton, they can be divided into 8 subgroups: furans, furofurans, dibenzylbutanes, dibenzylbutyrolactones, dibenzocyclooctadienes, dibenzylbutyrolactols, aryltetralins and arylnaphthalenes. Furthermore, there is considerable variability regarding the oxidation level of both the propyl side chains and the aromatic rings.
They are not present in the free form in nature, but linked to other molecules, mainly as glycosylated derivatives.
The most common lignans include secoisolariciresinol (the most abundant one), lariciresinol, pinoresinol, matairesinol and 7-hydroxymatairesinol.

Note: They occur not only as dimers but also as more complex oligomers, such as dilignans and sesquilignans.

Synthesis of lignans

In this section, we will examine the synthesis of some of the most common lignans.
The pathway starts from 3 of the 4 most common dietary hydroxycinnamic acids: p-coumaric acid, sinapic acid, and ferulic acid (caffeic acid is not a precursor of this subgroup of polyphenols). Therefore, they arise from the shikimic acid pathway, via phenylalanine.

Fig. 2 – Lignan Biosynthesis

The first three reactions reduce the carboxyl group of the hydroxycinnamates to an alcohol group, with formation of the corresponding alcohols, called monolignols, namely p-coumaryl alcohol, sinapyl alcohol and coniferyl alcohol. These molecules also enter the pathway of lignin biosynthesis.

  • The first step, which leads to the activation of the hydroxycinnamic acids, is catalyzed by hydroxycinnamate:CoA ligases, commonly called p-coumarate:CoA ligases (EC …), with formation of the corresponding hydroxycinnamate-CoAs, namely feruloyl-CoA, p-coumaroyl-CoA and sinapoyl-CoA.
  • In the second step, an NADPH-dependent cinnamoyl-CoA oxidoreductase, also called cinnamoyl-CoA reductase (EC 1.2.1.44), catalyzes the formation of the corresponding aldehydes and the release of coenzyme A.
  • In the last step, an NADPH-dependent cinnamyl alcohol dehydrogenase, also called monolignol dehydrogenase (EC …), catalyzes the reduction of the aldehyde group to an alcohol group, with the formation of the aforementioned monolignols.
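
The three steps above can be laid out as a small table for each of the three precursor acids. The Python sketch below is illustrative only; `STEP_ENZYMES`, `MONOLIGNOL_ROUTES` and `monolignol_steps` are hypothetical names. The CoA-thioesters and monolignols are those named in the text (with standard spelling), while the aldehyde names (p-coumaraldehyde, coniferaldehyde, sinapaldehyde) do not appear in the article and are added here as the usual names of the intermediates.

```python
# Enzymes of the three steps, in the order given in the text.
STEP_ENZYMES = (
    "hydroxycinnamate:CoA ligase",
    "cinnamoyl-CoA reductase",
    "cinnamyl alcohol dehydrogenase",
)

# acid -> (CoA-thioester, aldehyde, monolignol).
# Aldehyde names are an addition for illustration; the text does not name them.
MONOLIGNOL_ROUTES = {
    "p-coumaric acid": ("p-coumaroyl-CoA", "p-coumaraldehyde", "p-coumaryl alcohol"),
    "ferulic acid": ("feruloyl-CoA", "coniferaldehyde", "coniferyl alcohol"),
    "sinapic acid": ("sinapoyl-CoA", "sinapaldehyde", "sinapyl alcohol"),
}


def monolignol_steps(acid):
    """Return (substrate, enzyme, product) for each of the three steps."""
    chain = (acid,) + MONOLIGNOL_ROUTES[acid]
    return [(chain[i], STEP_ENZYMES[i], chain[i + 1]) for i in range(len(STEP_ENZYMES))]


if __name__ == "__main__":
    for substrate, enzyme, product in monolignol_steps("ferulic acid"):
        print(f"{substrate} --[{enzyme}]--> {product}")
```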

The next step, the dimerization of monolignols, involves stereoselective or, more precisely, enantioselective mechanisms. In fact, most plant lignans exist as (+)- or (-)-enantiomers, whose relative amounts can vary from species to species, and also between different organs of the same plant, depending on the type of reactions involved.
The dimerization can occur through enzymatic reactions involving laccases (EC …). These enzymes catalyze the formation of radicals that, on dimerizing, form a racemic mixture. However, this does not explain how the non-racemic mixtures found in plants are formed. The most widely accepted mechanism to explain the stereospecific synthesis involves the action of the laccase together with a protein able to direct the synthesis toward one or the other of the two enantiomeric forms: the dirigent protein. The reaction scheme might be as follows: the enzyme catalyzes the synthesis of phenylpropanoid radicals, which the dirigent protein then orients in such a way as to obtain the desired stereospecific coupling.

Fig. 3 – (-)-Matairesinol (skeletal formula)

For example, pinoresinol synthase, consisting of laccase and dirigent protein, catalyzes the stereospecific synthesis of (+)-pinoresinol from two units of coniferyl alcohol. (+)-Pinoresinol, in two consecutive stereospecific reactions catalyzed by NADPH-dependent pinoresinol/lariciresinol reductase (EC …), is first reduced to (+)-lariciresinol, and then to (-)-secoisolariciresinol. (-)-Secoisolariciresinol, in the reaction catalyzed by NAD(P)-dependent secoisolariciresinol dehydrogenase (EC …), is oxidized to (-)-matairesinol.
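
The sequence from coniferyl alcohol to (-)-matairesinol can be summarized as an ordered list of steps. The sketch below is illustrative: `LIGNAN_STEPS`, `is_connected` and `steps_catalyzed_by` are hypothetical names, and the list contains only the reactions named in the paragraph above.

```python
# (substrate, enzyme, product) in the order described in the text.
LIGNAN_STEPS = [
    ("coniferyl alcohol", "pinoresinol synthase", "(+)-pinoresinol"),
    ("(+)-pinoresinol", "pinoresinol/lariciresinol reductase", "(+)-lariciresinol"),
    ("(+)-lariciresinol", "pinoresinol/lariciresinol reductase", "(-)-secoisolariciresinol"),
    ("(-)-secoisolariciresinol", "secoisolariciresinol dehydrogenase", "(-)-matairesinol"),
]


def is_connected(steps):
    """Check that each product is the substrate of the following step."""
    return all(steps[i][2] == steps[i + 1][0] for i in range(len(steps) - 1))


def steps_catalyzed_by(enzyme, steps=LIGNAN_STEPS):
    """Return the (substrate, product) pairs handled by one enzyme."""
    return [(s, p) for s, e, p in steps if e == enzyme]


if __name__ == "__main__":
    print("connected:", is_connected(LIGNAN_STEPS))
    print(steps_catalyzed_by("pinoresinol/lariciresinol reductase"))
```

Listing the steps this way makes it easy to see that the same reductase acts twice in a row, first on (+)-pinoresinol and then on (+)-lariciresinol.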

Metabolism by human gut microbiota

Their importance to human health is due largely to their metabolism by gut microbiota, which carries out deglycosylations, para-dehydroxylations, and meta-demethylations without enantiomeric inversion. Indeed, this metabolism leads to the formation of molecules with a modest estrogen-like activity (phytoestrogens), a situation similar to that observed with some isoflavones, such as those of soybean, some coumarins, and some stilbenes. These active metabolites are the so-called “mammalian lignans” or “enterolignans”, such as the aglycones of enterodiol and enterolactone, formed from secoisolariciresinol and matairesinol, respectively.
Studies conducted on animals fed diets rich in lignans have shown their presence as intact molecules, in low concentrations, in serum, suggesting that they may be absorbed as such from the intestine. These molecules exhibit estrogen-independent actions, both in vivo and in vitro, such as inhibition of angiogenesis, reduction of diabetes, and suppression of tumor growth.
Note: The term “phytoestrogen” refers to molecules with estrogenic or antiandrogenic activity, at least in vitro.

Once absorbed, they enter the enterohepatic circulation, and, in the liver, may undergo phase II reactions and be sulfated or glucuronidated, and finally excreted in the urine.

Food sources

The richest dietary source is flaxseed (linseed), which contains mainly secoisolariciresinol, but also lariciresinol, pinoresinol and matairesinol in good quantity (for a total amount of more than 3.7 mg/100 g dry weight). They are also found in sesame seeds.

Fig. 4 – (-)-Secoisolariciresinol (skeletal formula)

Another important source is whole grains.
They are also present in other foods, but in concentrations from one hundred to one thousand times lower than those of flaxseed. Examples are:

  • beverages: they are generally most abundant in red wine, followed in descending order by black tea, soy milk and coffee;
  • fruits, such as apricots, pears, peaches, strawberries;
  • among vegetables, Brassicaceae, garlic, asparagus and carrots;
  • lentils and beans.

Their presence in grains and, to a lesser extent in red wine and fruit, makes them, at least in individuals who follow a Mediterranean-style eating pattern, the main source of phytoestrogens.

