Handle non-unique oracle card attributes

Though it should never be the case, a Scryfall data dump included a card
whose oracle text differed between variants sharing the same oracle id.
Using DISTINCT ON ("oracle_id") with an ORDER BY clause selects exactly
one row per oracle id to be inserted into the database.
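
As a rough illustration of that behaviour, here is a minimal sketch in
plain Python rather than PostgreSQL. The rows and the helper name
`distinct_on_first` are hypothetical, not taken from the dump or the
codebase, and the sketch assumes no NULL values and plain code point
ordering, whereas PostgreSQL orders text by the column's collation.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows shaped like (oracle_id, name, oracle_text).
rows = [
    ("id-1", "Card A", "it deals 1 damage"),
    ("id-1", "Card A", "Card A deals 1 damage"),
    ("id-2", "Card B", "Flying"),
]


def distinct_on_first(rows):
    """Keep the first row per leading column after sorting on all columns.

    This mirrors SELECT DISTINCT ON (col0) ... ORDER BY col0, col1, col2.
    """
    ordered = sorted(rows)  # tuples compare column by column, like ORDER BY
    return [next(group) for _, group in groupby(ordered, key=itemgetter(0))]


result = distinct_on_first(rows)
```

Sorting on every selected column makes the surviving row deterministic
for a given input instead of depending on scan order.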

The dump from 20230617212037 included a single oracle entry breaking
this assumed guarantee with versions containing differing oracle text:

    \x on
    SELECT DISTINCT "oracle_id"
         , "name"
         , "color_identity"
         , "cmc"
         , "mana_cost"
         , "type_line"
         , "edhrec_rank"
         , "oracle_text"
    FROM "tmp_cards"
    WHERE "tmp_cards"."oracle_id" = '5feedfb0-30e6-400d-9e28-d541ea1aa14e';

    oracle_id	5feedfb0-30e6-400d-9e28-d541ea1aa14e
    name	Plague Spitter
    color_identity	B
    cmc	3.00
    mana_cost	{2}{B}
    type_line	Creature — Phyrexian Horror
    edhrec_rank	7226
    oracle_text	At the beginning of your upkeep, Plague Spitter deals 1 damage to each creature and each player.
    When Plague Spitter dies, Plague Spitter deals 1 damage to each creature and each player.

    oracle_id	5feedfb0-30e6-400d-9e28-d541ea1aa14e
    name	Plague Spitter
    color_identity	B
    cmc	3.00
    mana_cost	{2}{B}
    type_line	Creature — Phyrexian Horror
    edhrec_rank	7226
    oracle_text	At the beginning of your upkeep, Plague Spitter deals 1 damage to each creature and each player.
    When Plague Spitter dies, it deals 1 damage to each creature and each player.
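
The query above inspects one known oracle id. To look for other ids with
the same problem, a check along these lines could be run over the rows
before inserting. This is only a sketch: `conflicting_oracle_ids` and the
sample rows are hypothetical and not part of this commit.

```python
from collections import defaultdict


def conflicting_oracle_ids(rows):
    """Return oracle ids that appear with more than one attribute tuple.

    Each row is (oracle_id, attr1, attr2, ...).
    """
    variants = defaultdict(set)
    for oracle_id, *attrs in rows:
        variants[oracle_id].add(tuple(attrs))
    return sorted(oid for oid, seen in variants.items() if len(seen) > 1)


# Hypothetical sample: "id-1" has two differing texts, "id-2" is consistent.
sample = [
    ("id-1", "Card A", "Card A deals 1 damage"),
    ("id-1", "Card A", "it deals 1 damage"),
    ("id-2", "Card B", "Flying"),
    ("id-2", "Card B", "Flying"),
]
conflicts = conflicting_oracle_ids(sample)
```
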
Correl Roush 2023-06-21 21:39:58 -04:00
parent 306d2f2ba0
commit 7c1050a7c2

@@ -252,7 +252,8 @@ def update_scryfall(ctx, filename):
await cursor.execute(
"""
INSERT INTO "oracle"
-SELECT DISTINCT "oracle_id"
+SELECT DISTINCT ON ("oracle_id")
+"oracle_id"
, "name"
, "color_identity"
, "cmc"
@@ -262,6 +263,14 @@ def update_scryfall(ctx, filename):
, "oracle_text"
FROM "tmp_cards"
WHERE "tmp_cards"."oracle_id" IS NOT NULL
+ORDER BY "oracle_id"
+, "name"
+, "color_identity"
+, "cmc"
+, "mana_cost"
+, "type_line"
+, "edhrec_rank"
+, "oracle_text"
ON CONFLICT (oracle_id) DO UPDATE
SET "name" = "excluded"."name"
, "color_identity" = "excluded"."color_identity"