• 0 Posts
  • 35 Comments
Joined 11 months ago
Cake day: August 8th, 2023



  • How good is “good”, you ask?

    We got pretty good results, with a CER (character error rate) of 4% and a WER (word error rate) of 15%!

    This was on a limited dataset used for both training and testing, which most likely means that if you tested on a larger dataset with greater variation in handwriting style, the numbers might be even worse.

    Very simplified: a risk of one wrong character in every 25 characters and one wrong word in every 7 words. The SER (sentence error rate) was around 20%.
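
    As a rough illustration of where numbers like these come from: CER and WER are usually computed as the edit distance between the recognized text and the ground truth, divided by the length of the ground truth (counted in characters or in words). The sketch below assumes that definition. The function names and sample strings are made up for the example and are not from the project described here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate; assumes a non-empty reference string."""
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate; assumes the reference contains at least one word."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def one_error_every(rate):
    """Turn an error rate into 'one error per N units'."""
    return 1 / rate


if __name__ == "__main__":
    print(cer("hello world", "hallo world"))  # 1 wrong character out of 11
    print(wer("hello world", "hallo world"))  # 1 wrong word out of 2
    print(one_error_every(0.04), one_error_every(0.15))
```

    An error rate r works out to roughly one error per 1/r units, so a WER of 15% is about one wrong word in every 6.7 words.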

    There’s a reason why no one has released a good model for western letters yet and why companies pay up to 1€ for capturing data from 10 handwritten pages.

    It will come but OCR isn’t as sexy as developing text2image solutions.




  • To train an AI to recognize handwriting you need a huge dataset of handwriting examples. That is, millions of samples of handwritten text plus information about what the written text says in every example.

    This is why the best engines only exist as a service in the cloud. The OCR engines you can install locally that are acceptable, but far from perfect, are commercial. Parascript FormXtra is one of the better commercial ones.

    The only OCR Engine that’s free and really good is Tesseract OCR but it doesn’t handle handwritten text.


  • I’m openly critical of the whole NATO thing and DSA but you’re just being silly, ignorant, a troll or all of the above.

    I’d rather be a part of the Western military industrial complex than being Ukraine since 2014.

    If Russia just could stop aspiring to be the premier asshole of the northern hemisphere, Sweden would still be “neutral” and democratic neighbors of Russia wouldn’t be forced to put huge amounts of tax money into arms instead of healthcare.

    Russia essentially attacked the guy sitting next to them on the bus because they felt the guy was sitting too close.

    Of course everyone on the bus gets scared of the idiot attacking people!




  • mindlight@lemm.ee to Asklemmy@lemmy.ml: Is RAID still needed?
    3 months ago

    Yeah, and the Titanic was unsinkable.

    If the controller in your SSD fries, it doesn’t matter how many unused gigabytes your SSD has got for relocating bad sectors. It is still fried. For you, that data is forever gone.

    This is why you have redundancy. Full redundancy. You can go for RAID1, where one disk can die and you still have no data loss, or go bananas with RAID6, where two full disks can die and you’re still going strong.

    PS: Spinning hard drives have had hidden sectors used for relocation of bad sectors for ages. It’s nothing new. If you have too much time on your hands, Google harddrive hidden sectors nsa.
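
    To make the counting explicit, here is a minimal sketch of how many whole-disk failures each setup tolerates. It is a simplified model with assumed names (`tolerated_failures`, `survives`), treating a lone SSD as "single", RAID1 as a plain mirror and RAID6 as dual parity. It ignores everything else that can go wrong, such as a second disk dying during a rebuild.

```python
def tolerated_failures(level: str, disks: int) -> int:
    """Number of whole-disk failures an array can lose without data loss."""
    if level == "single":
        return 0                      # one disk, one fried controller, data gone
    if level == "raid1":
        if disks < 2:
            raise ValueError("RAID1 needs at least 2 disks")
        return disks - 1              # every disk holds a full copy
    if level == "raid6":
        if disks < 4:
            raise ValueError("RAID6 needs at least 4 disks")
        return 2                      # two disks' worth of parity
    raise ValueError(f"unknown level: {level}")


def survives(level: str, disks: int, failed: int) -> bool:
    """True if the data is still intact after `failed` disks have died."""
    return failed <= tolerated_failures(level, disks)


if __name__ == "__main__":
    for level, disks, failed in [("single", 1, 1), ("raid1", 2, 1), ("raid6", 6, 2)]:
        print(level, disks, failed, survives(level, disks, failed))
```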


  • mindlight@lemm.ee to Open Source@lemmy.ml: text in image translation
    3 months ago

    I don’t have the answer you’re looking for but maybe a pointer for where to look and what to look for …

    What you want is essentially done in two steps.

    1. Optical Character Recognition - an image consists of pixels. There is no text, just pixels. You need a program that can see the difference between pixels forming an A and pixels forming a B. Tesseract is a very competent program for this and it’s free. However, it’s command line only, but I know there are GUI applications based on Tesseract.

    2. Translate text from one language to another - maybe Dialect?
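
    The two steps above could be glued together roughly like this. It is only a sketch: it assumes the `tesseract` command is installed and on your PATH, and the translation step is left as a function you plug in yourself, since which translation tool or service to use is still an open question here. The helper names (`tesseract_command`, `ocr_image`, `translate_image`) are made up for the example.

```python
import subprocess
from typing import Callable


def tesseract_command(image_path: str, lang: str = "eng") -> list[str]:
    """Build the Tesseract call; 'stdout' as output base prints the text."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path: str, lang: str = "eng") -> str:
    """Step 1: turn the pixels into text by running the Tesseract CLI."""
    result = subprocess.run(
        tesseract_command(image_path, lang),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def translate_image(
    image_path: str,
    translate: Callable[[str], str],
    ocr: Callable[[str], str] = ocr_image,
) -> str:
    """Step 2: hand the recognized text to whatever translator you choose."""
    return translate(ocr(image_path))


if __name__ == "__main__":
    # Stand-ins for both steps so the sketch runs without Tesseract installed.
    print(translate_image("sign.png", translate=str.upper, ocr=lambda path: "hallo"))
```

    Keeping the translator as a parameter means you can swap it for anything that takes a string and returns a string.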


  • Wall of text, I know, but I had trouble sleeping so… Yeah… Here goes;

    Knowledge is power.

    Here in Sweden there’s a service that has been pouring money into marketing for the last two years. The service is called House ID and they let you store all important documents about your house for free… Free… Free?

    So what will they make money on?

    Well, let’s jump 10 years into the future and just imagine the possibilities.

    Criminals can easily check which homeowners have upgraded their locks or purchased home alarm systems. They could even purchase data about all the houses in an area that have a specific lock type with a known flaw.

    Your phone is, with all its sensors, a fantastic surveillance device and people happily take it with them wherever they go.

    In the 90’s, when I worked for IBM, the buzzword was “Data mining”. Ordinary people never understood what it was and I was often asked about it. Extremely simplified: look at the data you have and try to read between the lines to generate data that you originally didn’t have.

    The biggest chain of convenience stores in Sweden launched banking services and a payment card around this time. If you used the card for grocery shopping you’d get a monthly bonus and great offers and discounts. So I gave an innocent example of what your purchase data could be used for. They could see that a woman purchased pads at roughly the same time each month or quarter. Cross-checking this with the purchase history of other women, they could see that a lot of those women also purchased chocolate at the same time they purchased pads. Something something about a lot of women craving chocolate around the same time each month. Yes, it’s a generalization, but still a real-life example in this case. So they sent out coupons for chocolate, timed to when the customer normally purchased pads, and what do you know? The sale of chocolate increased. Significantly.

    Now, pads aren’t a very sensitive subject if you’re older than 15… But think about what data Tinder registers. They can’t know for sure if you’re liberal, conservative or even a communist… or can they? By looking at your behavior in their app, what you did, where (Tinder uses GPS, remember?) you did it and when you did it, they can draw conclusions about a lot of things that you never intended to share with them.

    Today there are sensors placed strategically in shopping malls that register which store windows you stopped to look at. They actually know, with pretty high certainty, exactly which product in the window caught your attention. How can they be so accurate, you ask? Because you have Bluetooth activated and the mall app installed. They just triangulate your exact position.
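
    To show what “reading between the lines” can look like in practice, here is a toy sketch of that kind of cross-checking. The baskets are invented purely for illustration and the function names are my own. It measures how much more often chocolate shows up in baskets that contain pads than in baskets overall, a ratio usually called lift.

```python
def support(baskets, item):
    """Fraction of all baskets that contain the item."""
    return sum(item in basket for basket in baskets) / len(baskets)


def confidence(baskets, given, item):
    """Of the baskets containing `given`, the fraction that also contain `item`."""
    with_given = [basket for basket in baskets if given in basket]
    return sum(item in basket for basket in with_given) / len(with_given)


def lift(baskets, given, item):
    """Above 1 means `item` is bought more often together with `given`."""
    return confidence(baskets, given, item) / support(baskets, item)


# Invented example data: each set is one purchase on the store card.
BASKETS = [
    {"pads", "chocolate"},
    {"pads", "chocolate", "milk"},
    {"pads", "bread"},
    {"milk", "bread"},
    {"chocolate"},
    {"bread"},
    {"milk"},
    {"bread", "milk"},
]

if __name__ == "__main__":
    print(support(BASKETS, "chocolate"))             # share of all baskets
    print(confidence(BASKETS, "pads", "chocolate"))  # share of pad baskets
    print(lift(BASKETS, "pads", "chocolate"))
```

    Nothing in the data says why the two go together. The pattern alone is enough to time a coupon.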

    All of this is data about you that is correct. You did all of that and it was registered.

    But what if corrupted data was registered? What if that data was the basis for you getting a loan for your dream house? How do you correct a conclusion that is obviously wrong when the bank just tells you that what data they purchase, from whom and how they process it is a business secret, and they refuse to share any details?

    Now, all sorts of data have always been collected, but in the old days it was stored on paper and cross-comparing/compiling data was an expensive and tedious task. Today it is not. Today your phone could store and process data that would have taken months to process in the old days.

    That slowness/inertia acted as a law of nature, protecting us and our life from being mapped.

    It’s not just that data is collected or what data is collected… It’s what it might be used for that should bother you. Not only what it is used for today but also what it could be used for tomorrow.






  • Normally it’s not a lack of Windows compatibility that breaks an application under wine, it’s the frameworks and libraries the application was built with/needs to have access to.

    So check what additional libraries and stuff your application has as prerequisites. Visual C++ 2005? 2010? DotNet framework? Which version? And so on.

    When you know what the application needs, then you can Google for “wine DotNet 4.5” (just an example) to get a feeling for what problems people had and how they solved them.

    Essentially, wine needs to know where to find those libraries when you start your application with/in wine.

    Also, if your application uses MSSQL Express or similar, you might be out of luck. So if that’s the case you should start googling on how to get that running (if even possible) before installing.

    Good luck, be stubborn and make sure to have fun. There’s a lot you’ll learn in this adventure of yours that will come in handy again and again in the future.



  • The statistics seem to be based on User Agent. A lot of people “fake” their user agent to avoid fingerprinting and other things.

    I myself used to do it when I wanted to download the Windows 10 ISO from Microsoft. If your UA said anything Windows you were forced to download through Microsoft’s USB tool. If it said Linux you got a direct link to the ISO.
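
    A small sketch of why such statistics are easy to skew: the server only sees whatever string the client chooses to send. The example UA strings follow the usual Firefox format, the URL is a placeholder, and `guess_os` is a deliberately naive stand-in for whatever a statistics site really does.

```python
import urllib.request

# Typical-looking Firefox user agents; the version numbers are just examples.
LINUX_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"
WINDOWS_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) "
              "Gecko/20100101 Firefox/115.0")


def guess_os(user_agent: str) -> str:
    """Naive OS detection of the kind UA-based statistics rely on."""
    ua = user_agent.lower()
    if "windows" in ua:
        return "Windows"
    if "linux" in ua:
        return "Linux"
    return "other"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Prepare a request that claims to come from the given browser/OS."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    # Placeholder URL; nothing is sent until you call urllib.request.urlopen().
    request = build_request("https://example.com/", LINUX_UA)
    print(guess_os(request.get_header("User-agent")))
```

    The same machine counts as Windows or Linux depending only on which constant you pass in.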


  • As someone who’s been writing for close to ten years now, I would tend to think I would get paid at this point. Or, at least write for a site that would. I can’t begin to describe in words how frustrating and unfair it’s been to see websites that are “younger” than me become so successful that they’re able to write for their site as a full-time job in just one year.

    This is a truth I’ve seen again and again throughout my career (started my career in IT in the 90’s). Just because you’re awesome at something doesn’t guarantee you a fat bank account. The people who most often succeed are the ones that have at least some knowledge of a subject and some understanding of how business works.

    It doesn’t mean anything that you’re the best at what you do if you don’t know how to get the customer to sign the agreement.