• 0 Posts
  • 10 Comments
Joined 1 year ago
Cake day: June 10th, 2023

  • underisk@lemmy.ml to asklemmy@lemmy.ml · Deleted
    1 year ago

    Yeah, exactly. Those aren’t words, they aren’t random, and they’re in a comma-separated list. Try asking it to produce something like this:

    Green five the scoured very fasting to lightness air bog.

    Even given that example, it usually just pops out a list of very similar words.


  • underisk@lemmy.ml to asklemmy@lemmy.ml · Deleted
    1 year ago

    For LLMs specifically, my go-to test is to ask for a paragraph of random words that does not have any kind of coherent meaning. This asks them to do the opposite of what they’re trained to do, so it trips them up pretty reliably. The closest I’ve seen them get was a comma-separated list of random words, and that was after giving them coaching prompts with examples.
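
    For contrast, here is a rough sketch of how trivially a plain script produces the kind of output the test asks for. The word list and the function name `random_word_paragraph` are made up for illustration; any list of unrelated words would do.

```python
import random

# Hypothetical word list, borrowed from the example sentence in this thread.
WORDS = ["green", "five", "the", "scoured", "very",
         "fasting", "to", "lightness", "air", "bog"]

def random_word_paragraph(words, length=10, rng=random):
    """Join independently chosen words into one sentence-shaped string."""
    # Each word is drawn independently, so there is no grammar or theme.
    picked = [rng.choice(words) for _ in range(length)]
    # Space-separated with a trailing period, not a comma-separated list.
    return " ".join(picked).capitalize() + "."

if __name__ == "__main__":
    print(random_word_paragraph(WORDS))
```

    The output is a run of words with no list punctuation and no coherent meaning, which is the target the models tend to miss.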



  • underisk@lemmy.ml to asklemmy@lemmy.ml · Deleted
    1 year ago

    There will never be any kind of permanent solution to this. Botting is an arms race, and as long as you are a large enough target, someone is going to figure out the 11ft ladder for your 10ft wall.

    That said, when coming up with a captcha challenge, you generally need to subvert the common approach just enough that people can’t pull an off-the-shelf solution. For example, instead of asking the potential bot to type out the letters in an image, ask it for the result of a math problem shown in the image. The attacker then needs more than a drop-in OCR to break it, and OCR is mostly trained on words, so it’s likely to struggle with math notation. It’s not that difficult to work around, but it does require them to write a custom approach for your captcha, which can deter most casual attempts for some time.
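
    A minimal sketch of the server-side half of that idea, assuming small integer operands and three operators. The names `make_challenge` and `check_answer` are hypothetical, and rendering the challenge text into a distorted image is left out; that step would need an imaging library.

```python
import operator
import random

# Operators the challenge may use; the symbol is what gets drawn in the image.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def make_challenge(rng=random):
    """Return (text to render as an image, expected integer answer)."""
    a = rng.randint(2, 12)
    b = rng.randint(2, 12)
    symbol = rng.choice(sorted(OPS))
    return f"{a} {symbol} {b} = ?", OPS[symbol](a, b)

def check_answer(expected, submitted):
    """Compare the user's typed answer with the stored expected value."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        # Non-numeric input is simply a failed attempt.
        return False

if __name__ == "__main__":
    text, answer = make_challenge()
    print(text)                             # the string that would be rendered
    print(check_answer(answer, str(answer)))
```

    The point is that a solver has to read the image and then evaluate the expression, so transcribing the characters alone is not enough to pass.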


  • For video, I think the best you’re likely to get is embedded players from popular video hosts. The costs and challenges of hosting video content are just not worth it for the people hosting the instances.

    GIF as a format is garbage. Terrible compression, poor quality, and weird quirks. They’re so bad that most platforms that host GIFs are just transparently converting them to MP4 videos because they actually take up fewer resources. If they get added, the only way I see it happening is as extremely short-form video with strict file size limits.

    If I’m being honest, though, I don’t miss them at all.