"alignment tax" meaning in English

Noun

Forms: alignment taxes [plural]
Etymology: First attested in a 2019 speech by computer scientist Paul Christiano (see quote), who attributed the idea to AI researcher and writer Eliezer Yudkowsky.
Head templates: {{en-noun}} alignment tax (plural alignment taxes)
  1. (artificial intelligence) A cost to the capabilities of an artificial intelligence resulting from the effects of aligning it with human ethics and morality.
    Wikipedia link: Eliezer Yudkowsky
    Categories (topical): Artificial intelligence
    Sense id: en-alignment_tax-en-noun-7WUfk1ms
    Categories (other): English entries with incorrect language header

Download JSON data for alignment tax meaning in English (3.1kB)

{
  "etymology_text": "First attested in a 2019 speech by computer scientist Paul Christiano (see quote), who attributed the idea to AI researcher and writer Eliezer Yudkowsky.",
  "forms": [
    {
      "form": "alignment taxes",
      "tags": [
        "plural"
      ]
    }
  ],
  "head_templates": [
    {
      "args": {},
      "expansion": "alignment tax (plural alignment taxes)",
      "name": "en-noun"
    }
  ],
  "lang": "English",
  "lang_code": "en",
  "pos": "noun",
  "senses": [
    {
      "categories": [
        {
          "kind": "other",
          "name": "English entries with incorrect language header",
          "parents": [
            "Entries with incorrect language header",
            "Entry maintenance"
          ],
          "source": "w"
        },
        {
          "kind": "topical",
          "langcode": "en",
          "name": "Artificial intelligence",
          "orig": "en:Artificial intelligence",
          "parents": [
            "Computer science",
            "Cybernetics",
            "Computing",
            "Sciences",
            "Applied mathematics",
            "Systems theory",
            "Technology",
            "All topics",
            "Mathematics",
            "Systems",
            "Fundamental",
            "Formal sciences",
            "Interdisciplinary fields",
            "Society"
          ],
          "source": "w"
        }
      ],
      "examples": [
        {
          "ref": "2019 August 29, Paul Christiano, Current work in AI alignment, EA Global San Francisco 2019",
          "text": "I like this notion of an \"alignment tax\" […] the reason I might compromise is if there's some tension, between having the AI that's robustly trying to do what I want, and having the AI that is competent or intelligent, and the alignment tax is intended to capture that gap—that cost that I incur if I insist on alignment.",
          "type": "quotation"
        },
        {
          "ref": "2021 December 1, Askell, A. et al., “A General Language Assistant as a Laboratory for Alignment”, in arXiv, →DOI",
          "text": "The fact that larger models are less subject to forgetting may be related to the fact that larger models do not incur significant alignment taxes.",
          "type": "quotation"
        },
        {
          "ref": "2022 March 4, Ouyang, L. et al., “Training language models to follow instructions with human feedback”, in arXiv, →DOI",
          "text": "We want an alignment procedure that avoids an alignment tax, because it incentivizes the use of models that are unaligned but more capable on these tasks.",
          "type": "quotation"
        },
        {
          "ref": "2023 February 27, Kornai, A. et al., “Safety without alignment”, in arXiv, →DOI",
          "text": "We note that instead of an alignment tax our proposal entails a safety dividend – the more rational the system the more capable and the safer it will be.",
          "type": "quotation"
        }
      ],
      "glosses": [
        "A cost to the capabilities of an artificial intelligence resulting from the effects of aligning it with human ethics and morality."
      ],
      "id": "en-alignment_tax-en-noun-7WUfk1ms",
      "links": [
        [
          "artificial intelligence",
          "artificial intelligence"
        ],
        [
          "cost",
          "cost#Noun"
        ],
        [
          "capabilities",
          "capability#Noun"
        ],
        [
          "artificial intelligence",
          "artificial intelligence#Noun"
        ],
        [
          "aligning",
          "align#Noun"
        ],
        [
          "human",
          "human#Adjective"
        ],
        [
          "ethics",
          "ethics#Noun"
        ],
        [
          "morality",
          "morality#Noun"
        ]
      ],
      "qualifier": "artificial intelligence",
      "raw_glosses": [
        "(artificial intelligence) A cost to the capabilities of an artificial intelligence resulting from the effects of aligning it with human ethics and morality."
      ],
      "wikipedia": [
        "Eliezer Yudkowsky"
      ]
    }
  ],
  "word": "alignment tax"
}
{
  "etymology_text": "First attested in a 2019 speech by computer scientist Paul Christiano (see quote), who attributed the idea to AI researcher and writer Eliezer Yudkowsky.",
  "forms": [
    {
      "form": "alignment taxes",
      "tags": [
        "plural"
      ]
    }
  ],
  "head_templates": [
    {
      "args": {},
      "expansion": "alignment tax (plural alignment taxes)",
      "name": "en-noun"
    }
  ],
  "lang": "English",
  "lang_code": "en",
  "pos": "noun",
  "senses": [
    {
      "categories": [
        "English countable nouns",
        "English entries with incorrect language header",
        "English lemmas",
        "English multiword terms",
        "English nouns",
        "English terms with quotations",
        "en:Artificial intelligence"
      ],
      "examples": [
        {
          "ref": "2019 August 29, Paul Christiano, Current work in AI alignment, EA Global San Francisco 2019",
          "text": "I like this notion of an \"alignment tax\" […] the reason I might compromise is if there's some tension, between having the AI that's robustly trying to do what I want, and having the AI that is competent or intelligent, and the alignment tax is intended to capture that gap—that cost that I incur if I insist on alignment.",
          "type": "quotation"
        },
        {
          "ref": "2021 December 1, Askell, A. et al., “A General Language Assistant as a Laboratory for Alignment”, in arXiv, →DOI",
          "text": "The fact that larger models are less subject to forgetting may be related to the fact that larger models do not incur significant alignment taxes.",
          "type": "quotation"
        },
        {
          "ref": "2022 March 4, Ouyang, L. et al., “Training language models to follow instructions with human feedback”, in arXiv, →DOI",
          "text": "We want an alignment procedure that avoids an alignment tax, because it incentivizes the use of models that are unaligned but more capable on these tasks.",
          "type": "quotation"
        },
        {
          "ref": "2023 February 27, Kornai, A. et al., “Safety without alignment”, in arXiv, →DOI",
          "text": "We note that instead of an alignment tax our proposal entails a safety dividend – the more rational the system the more capable and the safer it will be.",
          "type": "quotation"
        }
      ],
      "glosses": [
        "A cost to the capabilities of an artificial intelligence resulting from the effects of aligning it with human ethics and morality."
      ],
      "links": [
        [
          "artificial intelligence",
          "artificial intelligence"
        ],
        [
          "cost",
          "cost#Noun"
        ],
        [
          "capabilities",
          "capability#Noun"
        ],
        [
          "artificial intelligence",
          "artificial intelligence#Noun"
        ],
        [
          "aligning",
          "align#Noun"
        ],
        [
          "human",
          "human#Adjective"
        ],
        [
          "ethics",
          "ethics#Noun"
        ],
        [
          "morality",
          "morality#Noun"
        ]
      ],
      "qualifier": "artificial intelligence",
      "raw_glosses": [
        "(artificial intelligence) A cost to the capabilities of an artificial intelligence resulting from the effects of aligning it with human ethics and morality."
      ],
      "wikipedia": [
        "Eliezer Yudkowsky"
      ]
    }
  ],
  "word": "alignment tax"
}

This page is a part of the kaikki.org machine-readable English dictionary. This dictionary is based on structured data extracted on 2024-05-18 from the enwiktionary dump dated 2024-05-02 using wiktextract (1d5a7d1 and 304864d). The data shown on this site has been post-processed and various details (e.g., extra categories) removed, some information disambiguated, and additional data merged from other sources. See the raw data download page for the unprocessed wiktextract data.
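
As a rough illustration of how an entry shaped like the JSON above might be consumed, the following Python sketch parses a trimmed copy of the entry and pulls out the headword, plural forms, glosses, and quotation references. The helper name `summarize_entry` and the trimmed `ENTRY_JSON` string are assumptions made for this example and are not part of wiktextract; only the field names (`word`, `pos`, `forms`, `senses`, `glosses`, `examples`, `ref`, `type`) are taken from the data shown on this page.

```python
import json

# Trimmed copy of the entry shown above (illustrative; only fields used below are kept).
ENTRY_JSON = """
{
  "word": "alignment tax",
  "pos": "noun",
  "forms": [{"form": "alignment taxes", "tags": ["plural"]}],
  "senses": [
    {
      "glosses": [
        "A cost to the capabilities of an artificial intelligence resulting from the effects of aligning it with human ethics and morality."
      ],
      "examples": [
        {
          "ref": "2019 August 29, Paul Christiano, Current work in AI alignment, EA Global San Francisco 2019",
          "type": "quotation"
        }
      ]
    }
  ]
}
"""


def summarize_entry(entry: dict) -> dict:
    """Hypothetical helper: collect headword, part of speech, plurals, glosses, and quotation refs."""
    # Keep only forms tagged as plural.
    plurals = [
        form["form"]
        for form in entry.get("forms", [])
        if "plural" in form.get("tags", [])
    ]
    glosses = []
    quotation_refs = []
    for sense in entry.get("senses", []):
        glosses.extend(sense.get("glosses", []))
        for example in sense.get("examples", []):
            # Examples marked as quotations carry a bibliographic "ref" string.
            if example.get("type") == "quotation":
                quotation_refs.append(example.get("ref", ""))
    return {
        "word": entry.get("word"),
        "pos": entry.get("pos"),
        "plurals": plurals,
        "glosses": glosses,
        "quotation_refs": quotation_refs,
    }


summary = summarize_entry(json.loads(ENTRY_JSON))
print(summary["word"], summary["pos"], summary["plurals"])
```

The sketch uses `.get()` with defaults throughout because, as the two objects above show, not every field (for example `id`) is present in every variant of an entry.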

If you use this data in academic research, please cite Tatu Ylonen: Wiktextract: Wiktionary as Machine-Readable Structured Data, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), pp. 1317-1325, Marseille, 20-25 June 2022. Linking to the relevant page(s) under https://kaikki.org would also be greatly appreciated.