alignment tax in English

JSON for postprocessed kaikki.org data shown on this page:

{
  "etymology_text": "First attested in a 2019 speech by computer scientist Paul Christiano (see quote), who attributed the idea to AI researcher and writer Eliezer Yudkowsky.",
  "forms": [
    {
      "form": "alignment taxes",
      "tags": [
        "plural"
      ]
    }
  ],
  "head_templates": [
    {
      "args": {},
      "expansion": "alignment tax (plural alignment taxes)",
      "name": "en-noun"
    }
  ],
  "lang": "English",
  "lang_code": "en",
  "pos": "noun",
  "senses": [
    {
      "categories": [
        {
          "kind": "other",
          "name": "English entries with incorrect language header",
          "parents": [],
          "source": "w"
        },
        {
          "kind": "other",
          "name": "Pages with 1 entry",
          "parents": [],
          "source": "w"
        },
        {
          "kind": "other",
          "name": "Pages with entries",
          "parents": [],
          "source": "w"
        },
        {
          "kind": "other",
          "langcode": "en",
          "name": "Artificial intelligence",
          "orig": "en:Artificial intelligence",
          "parents": [],
          "source": "w"
        }
      ],
      "examples": [
        {
          "bold_text_offsets": [
            [
              26,
              39
            ],
            [
              227,
              240
            ]
          ],
          "ref": "2019 August 29, Paul Christiano, Current work in AI alignment, EA Global San Francisco 2019:",
          "text": "I like this notion of an \"alignment tax\" […] the reason I might compromise is if there's some tension, between having the AI that's robustly trying to do what I want, and having the AI that is competent or intelligent, and the alignment tax is intended to capture that gap—that cost that I incur if I insist on alignment.",
          "type": "quote"
        },
        {
          "bold_text_offsets": [
            [
              130,
              145
            ]
          ],
          "ref": "2021 December 1, Askell, A. et al., “A General Language Assistant as a Laboratory for Alignment”, in arXiv, →DOI:",
          "text": "The fact that larger models are less subject to forgetting may be related to the fact that larger models do not incur significant alignment taxes.",
          "type": "quote"
        },
        {
          "bold_text_offsets": [
            [
              46,
              59
            ]
          ],
          "ref": "2022 March 4, Ouyang, L. et al., “Training language models to follow instructions with human feedback”, in arXiv, →DOI:",
          "text": "We want an alignment procedure that avoids an alignment tax, because it incentivizes the use of models that are unaligned but more capable on these tasks.",
          "type": "quote"
        },
        {
          "bold_text_offsets": [
            [
              27,
              40
            ]
          ],
          "ref": "2023 February 27, Kornai, A. et al., “Safety without alignment”, in arXiv, →DOI:",
          "text": "We note that instead of an alignment tax our proposal entails a safety dividend – the more rational the system the more capable and the safer it will be.",
          "type": "quote"
        }
      ],
      "glosses": [
        "A cost to the capabilities of an artificial intelligence resulting from the effects of aligning it with human ethics and morality."
      ],
      "id": "en-alignment_tax-en-noun-7WUfk1ms",
      "links": [
        [
          "artificial intelligence",
          "artificial intelligence"
        ],
        [
          "cost",
          "cost#Noun"
        ],
        [
          "capabilities",
          "capability#Noun"
        ],
        [
          "artificial intelligence",
          "artificial intelligence#Noun"
        ],
        [
          "aligning",
          "align#Noun"
        ],
        [
          "human",
          "human#Adjective"
        ],
        [
          "ethics",
          "ethics#Noun"
        ],
        [
          "morality",
          "morality#Noun"
        ]
      ],
      "qualifier": "artificial intelligence",
      "raw_glosses": [
        "(artificial intelligence) A cost to the capabilities of an artificial intelligence resulting from the effects of aligning it with human ethics and morality."
      ],
      "wikipedia": [
        "Eliezer Yudkowsky"
      ]
    }
  ],
  "word": "alignment tax"
}

JSON for raw wiktextract data:

{
  "etymology_text": "First attested in a 2019 speech by computer scientist Paul Christiano (see quote), who attributed the idea to AI researcher and writer Eliezer Yudkowsky.",
  "forms": [
    {
      "form": "alignment taxes",
      "tags": [
        "plural"
      ]
    }
  ],
  "head_templates": [
    {
      "args": {},
      "expansion": "alignment tax (plural alignment taxes)",
      "name": "en-noun"
    }
  ],
  "lang": "English",
  "lang_code": "en",
  "pos": "noun",
  "senses": [
    {
      "categories": [
        "English countable nouns",
        "English entries with incorrect language header",
        "English lemmas",
        "English multiword terms",
        "English nouns",
        "English terms with quotations",
        "Pages with 1 entry",
        "Pages with entries",
        "en:Artificial intelligence"
      ],
      "examples": [
        {
          "bold_text_offsets": [
            [
              26,
              39
            ],
            [
              227,
              240
            ]
          ],
          "ref": "2019 August 29, Paul Christiano, Current work in AI alignment, EA Global San Francisco 2019:",
          "text": "I like this notion of an \"alignment tax\" […] the reason I might compromise is if there's some tension, between having the AI that's robustly trying to do what I want, and having the AI that is competent or intelligent, and the alignment tax is intended to capture that gap—that cost that I incur if I insist on alignment.",
          "type": "quote"
        },
        {
          "bold_text_offsets": [
            [
              130,
              145
            ]
          ],
          "ref": "2021 December 1, Askell, A. et al., “A General Language Assistant as a Laboratory for Alignment”, in arXiv, →DOI:",
          "text": "The fact that larger models are less subject to forgetting may be related to the fact that larger models do not incur significant alignment taxes.",
          "type": "quote"
        },
        {
          "bold_text_offsets": [
            [
              46,
              59
            ]
          ],
          "ref": "2022 March 4, Ouyang, L. et al., “Training language models to follow instructions with human feedback”, in arXiv, →DOI:",
          "text": "We want an alignment procedure that avoids an alignment tax, because it incentivizes the use of models that are unaligned but more capable on these tasks.",
          "type": "quote"
        },
        {
          "bold_text_offsets": [
            [
              27,
              40
            ]
          ],
          "ref": "2023 February 27, Kornai, A. et al., “Safety without alignment”, in arXiv, →DOI:",
          "text": "We note that instead of an alignment tax our proposal entails a safety dividend – the more rational the system the more capable and the safer it will be.",
          "type": "quote"
        }
      ],
      "glosses": [
        "A cost to the capabilities of an artificial intelligence resulting from the effects of aligning it with human ethics and morality."
      ],
      "links": [
        [
          "artificial intelligence",
          "artificial intelligence"
        ],
        [
          "cost",
          "cost#Noun"
        ],
        [
          "capabilities",
          "capability#Noun"
        ],
        [
          "artificial intelligence",
          "artificial intelligence#Noun"
        ],
        [
          "aligning",
          "align#Noun"
        ],
        [
          "human",
          "human#Adjective"
        ],
        [
          "ethics",
          "ethics#Noun"
        ],
        [
          "morality",
          "morality#Noun"
        ]
      ],
      "qualifier": "artificial intelligence",
      "raw_glosses": [
        "(artificial intelligence) A cost to the capabilities of an artificial intelligence resulting from the effects of aligning it with human ethics and morality."
      ],
      "wikipedia": [
        "Eliezer Yudkowsky"
      ]
    }
  ],
  "word": "alignment tax"
}

This page is part of the kaikki.org machine-readable English dictionary. The dictionary is based on structured data extracted on 2025-06-16 from the enwiktionary dump dated 2025-06-01 using wiktextract (074e7de and f1c2b61). The data shown on this site has been post-processed: various details (e.g., extra categories) were removed, some information was disambiguated, and additional data was merged from other sources. See the raw data download page for the unprocessed wiktextract data.
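As a rough illustration of how an entry like the one above might be consumed, the following Python sketch pulls the highlighted term out of a quotation using its "bold_text_offsets" field. It assumes each offset pair is a half-open [start, end) character range into "text"; that reading is inferred from this entry only, not taken from the wiktextract documentation. The helper name `bolded_spans` is hypothetical, and the embedded JSON is a trimmed copy of one example from the entry.

```python
import json

# Trimmed copy of one quotation from the entry above; only the fields used here are kept.
entry_json = """
{
  "word": "alignment tax",
  "senses": [
    {
      "examples": [
        {
          "bold_text_offsets": [[46, 59]],
          "text": "We want an alignment procedure that avoids an alignment tax, because it incentivizes the use of models that are unaligned but more capable on these tasks.",
          "type": "quote"
        }
      ]
    }
  ]
}
"""


def bolded_spans(entry):
    """Return the substrings marked by bold_text_offsets in every example.

    Assumption: each pair is [start, end) in characters, end exclusive.
    """
    spans = []
    for sense in entry.get("senses", []):
        for example in sense.get("examples", []):
            text = example.get("text", "")
            for start, end in example.get("bold_text_offsets", []):
                spans.append(text[start:end])
    return spans


entry = json.loads(entry_json)
spans = bolded_spans(entry)
print(spans)
```

The `.get(..., [])` defaults are a defensive choice for this sketch, so that a sense without examples or an example without offsets is simply skipped.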

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.