    Georgiana Bere
    @georgiana-b
    Hello! Thank you for this great tool! Is there a way to get the language that has been detected from a DateDataParser instance? I would like to change the 'DATE_ORDER' to 'MDY' if the language is English, because I use it on European data and the British order differs from the American one.
    Waqas Shabir
    @waqasshabbir
    Hello @georgiana-b, unfortunately there is currently no direct way to get the detected language. Since there could be interesting use cases for it, I'll try to add this feature in the coming release.
    Sakari Vaelma
    @vaelma
    Hi, I'm doing some work on the Finnish language file, and ran into a problem: in Finnish the word order is sometimes different, e.g. "in 1 days" would be "1 päivän päästä", where "päivän" would mean "day" and "päästä" roughly "in". If these are translated directly, they will result in "1 day in". It will work, but it feels hack-ish and the tests will look strange. Is there a way to reorder the words at the moment?
    Waqas Shabir
    @waqasshabbir
    Hi @vaelma, for this kind of translation we normally use simplifications.
    Try adding (\d+) (päivän|vuosi|kuukausi|viikko|tunti|minuutti|sekuntti) päästä: kuluttua \1 \2
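A minimal sketch of what such a simplification does, using only Python's `re` module rather than dateparser's own machinery. The names `UNITS`, `PATTERN`, `REPLACEMENT`, and `simplify` are hypothetical and chosen for illustration; the pattern and replacement are the ones quoted in the message above.

```python
import re

# Hypothetical stand-in for one "simplification" entry in a language file:
# a regex pattern mapped to a replacement that uses backreferences.
UNITS = "päivän|vuosi|kuukausi|viikko|tunti|minuutti|sekuntti"
PATTERN = re.compile(rf"(\d+) ({UNITS}) päästä", re.IGNORECASE)
REPLACEMENT = r"kuluttua \1 \2"


def simplify(text: str) -> str:
    """Move the postposition to the front so the word order mirrors English 'in 1 day'."""
    return PATTERN.sub(REPLACEMENT, text)


print(simplify("1 päivän päästä"))  # kuluttua 1 päivän
```

The rewritten string is only an intermediate form: once the leading word sits where English puts "in", a word-by-word translation comes out in the expected order.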
    Sakari Vaelma
    @vaelma
    Makes sense, thanks! I already made a bunch of widely used "simplifications": https://github.com/vaelma/dateparser/blob/fin-datefix/data/languagefiles/fi.yaml
    I'll update them and the tests, too.
    I was just afraid to make the "target" string into something ungrammatical, although I guess I should think about it as just a kind of intermediate version on its way to the English translation?
    Sakari Vaelma
    @vaelma
    Another thing I'm wondering about: in Finnish the most common way to write dates in prose is "12. tammikuuta 2015", where the dot denotes ordinals (same as -st, -nd, -th in English). They work now with my additions to the language file, but the only way to make the tests pass was to write them in the same fashion in English: https://github.com/vaelma/dateparser/blob/fin-datefix/tests/test_languages.py#L134 How should it be done properly, or is this fine?
    Waqas Shabir
    @waqasshabbir
    Yes that's the right way to put it. Re: 12. try adding "." to skip: https://github.com/vaelma/dateparser/blob/fin-datefix/data/languagefiles/fi.yaml#L5
    Sakari Vaelma
    @vaelma

    Great, now the tests look much nicer!
    I have skip: [":n", "."] in the lang file (the first one accounts for the genitives that are sometimes used), but I'm still getting errors from nosetests:

    - 1 january 2016
    + 1. january 2016

    for my assertion: param('fi', "1. tammikuuta, 2016", "1 january 2016"),
    The dot is directly connected to the number, i.e. there is no space between them. Would that cause this problem?

    Waqas Shabir
    @waqasshabbir
    I don't think the space is the issue. Can you leave that out and create a PR? I'll dig into it shortly.
    Sakari Vaelma
    @vaelma
    Sure thing. Thanks!
    Oleg Lebedev
    @olebedev
    Hi guys!
    Such a great tool you made, thanks. I am looking for a similar library written in Golang. Do you know if there is one? Otherwise, I will have to port this one to Golang...
    Waqas Shabir
    @waqasshabbir
    Hi @olebedev, glad that you find it helpful. I don't know of any written in Golang.
    eagleyes
    @eagleyes
    Guys, I have a question for you! Is dateparser able to parse a date from normal text like "what is the weather today"? I tried the parse method, but it returns None. The string "today" on its own is OK, but if I add anything besides the date it doesn't work. Let me know!
    Artur Sadurski
    @asadurski
    Hello @eagleyes! No, dateparser can't handle that kind of operation at the moment. You need to pass a specific date string, without any other words. Still, it is one of the planned features: scrapinghub/dateparser#82.
    Atul Krishna
    @atultherajput
    Hello @asadurski, I want to participate in GSoC this year. I am interested in working on the "Find and Parse Expressions of Times in Large Texts" idea for dateparser. Any advice?
    Harikrishnan Shaji
    @har777
    Is there a PREFER_MONTH_OF_YEAR functionality, similar to PREFER_DAY_OF_MONTH but for the month?
    Artur Sadurski
    @asadurski
    @har777: No, we don't have that at the moment. I recommend pre-processing the strings as a workaround (if that's possible in your use case).
    Feel free to open a feature request.
    exioReed
    @exioReed

    Hello everyone, I just experimented with dateparser in a small cli application for more flexible and natural date parsing. I noticed that importing the module takes quite some time:

    $ time python -c 'import dateparser': ~2.5 s on a MacBook Air Mid 2012, ~1.3 s on a desktop with an Intel i5 3470.

    I'm wondering if anyone has encountered similar times and if I'm missing something obvious which could improve the import time (e.g. configuration)?

    stevespark
    @stevespark
    Hi all, I am looking into extending some of the supported languages. Is there a difference between, for example, Tuesday and tuesday in the language files?
    Krzysztof Koziarek
    @krzynio
    Hi, I have a question: how can I recognize which fields (month, day) were filled in automatically (when the parsed date is not complete)?
    BrenenP
    @doonce
    I just had an instance where dateparser parsed "2/3 6:30pm" as both february and march on two back-to-back runs. Is there anything I can do to prevent that?
    @exioReed: Could you check with the new release? It's supposed to run and be imported quicker.
    Artur Sadurski
    @asadurski
    @stevespark: No, month/weekday names are not case sensitive, we usually use re.IGNORECASE when matching them, but following the beautiful rules of orthography is the right thing to do, isn't it?
    exioReed
    @exioReed
    @asadurski I haven't experimented much yet, but the import is way faster now. Thanks :-)
    Lee Sai Mun
    @SaiMun92
    Hi guys, how do I parse the text "the 3rd day of October 2011" into a readable date format? dateparser returns null. Any suggestions?
    Lee Sai Mun
    @SaiMun92
    Solved it! I just had to remove the word "day" and it's all fine now.
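The fix described above can be automated as a pre-processing step before the string is handed to the parser. This is only a sketch using the standard `re` module; the helper name `drop_day_word` is hypothetical, and it assumes the word "day" directly follows an English ordinal such as "3rd".

```python
import re


def drop_day_word(text: str) -> str:
    """Remove a standalone 'day' that directly follows an ordinal like '3rd'."""
    return re.sub(r"(\d+(?:st|nd|rd|th))\s+day\b", r"\1", text, flags=re.IGNORECASE)


print(drop_day_word("the 3rd day of October 2011"))  # the 3rd of October 2011
```

Strings without that pattern pass through unchanged, so the helper can be applied to every input.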
    StudentForever
    @StudentForever
    Hi
    import dateparser
    dateparser.parse("20180201")
    datetime.datetime(8020, 2, 1, 1, 0)
    Am I doing something wrong here? I'm getting an incorrect date while parsing.
    Artur Sadurski
    @asadurski
    @StudentForever: I think your situation is similar to scrapinghub/dateparser#373 and the temporary workaround would be to use date_formats=["%Y%m%d"].
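To see what the format string in that workaround expects, here is a short sketch that uses only the standard library's `datetime.strptime`. The workaround itself passes the same format to dateparser, i.e. `dateparser.parse("20180201", date_formats=["%Y%m%d"])`. The variable name `parsed` is just for illustration.

```python
from datetime import datetime

# "%Y%m%d" reads four year digits, then two month digits, then two day digits,
# with no separators in between.
parsed = datetime.strptime("20180201", "%Y%m%d")

print(parsed)  # 2018-02-01 00:00:00
```

Giving an explicit format removes the ambiguity about where the year ends in an unseparated digit string.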
    StudentForever
    @StudentForever
    @asadurski Thanks a lot.
    royee17
    @royee17
    Hi, I'm using the 'search_dates' function to detect dates in a string. Is there any way to tell which kind of date was detected in the string? For instance, whether it detected a year only (e.g. "in 2014"), a month only (e.g. "in May") or a full date (e.g. "May 6, 2016"), etc. Thanks!
    Alexandra Babicheva
    @Ala1s
    Hi, I get ('2 11 JUL 17:41', datetime.datetime(2015, 7, 2, 17, 41)), whereas the correct result would be ('11 JUL 17:41', datetime.datetime(2015, 7, 11, 17, 41)). How can I fix it?
    The "2" belongs to a different entity in the text, and there is no way to remove it without messing up the dates.
    solveretur
    @solveretur
    Hi everyone, I would like to search for dates in a string, but many of the dates are in the format dd.MM.yyyy, and when I use the search_dates method it returns only a year. Is there any way to do it properly?
    Jason Kiley
    @jtkiley

    Hi, I'm trying to reason through how to deal with timezone info in some dates I'm parsing with dateparser, and I could use a little help.

    The dates look like this 'November 13, 2012 7:00 AM ET', and dateparser parses it to datetime.datetime(2012, 11, 13, 7, 0, tzinfo=<StaticTzInfo 'ET'>). The issue is that, if I convert it to iso format, it seems that it's always getting a -5:00 offset. That's fine for this date, but it's wrong if you change the month to May, where the offset should be -4:00.

    I do care about the fact that these are Eastern time (the main purpose is linking with financial market data; a big majority are ET), so I'd like to preserve that. However, I also need to move these dates around a bit (e.g., into a database), and I'm concerned about the conversions applying this static offset (instead of interpreting ET as EST or EDT depending on the date). Any advice or best practices?
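One possible approach, sketched here under the assumption that "ET" always means US Eastern time: discard the static offset and attach the IANA zone `America/New_York` with the standard library's `zoneinfo` (Python 3.9+; some platforms also need the `tzdata` package). The helper name `as_eastern` is hypothetical, and the naive datetimes below stand in for parser output.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# Assumption: "ET" in the source strings means US Eastern time.
EASTERN = ZoneInfo("America/New_York")


def as_eastern(parsed: datetime) -> datetime:
    """Drop any fixed offset and reinterpret the wall-clock time as US Eastern."""
    return parsed.replace(tzinfo=EASTERN)


winter = as_eastern(datetime(2012, 11, 13, 7, 0))
summer = as_eastern(datetime(2012, 5, 13, 7, 0))

print(winter.isoformat())  # 2012-11-13T07:00:00-05:00
print(summer.isoformat())  # 2012-05-13T07:00:00-04:00
```

Because the zone carries the daylight-saving rules, the offset follows the date instead of staying fixed at -05:00. Converting with `astimezone(ZoneInfo("UTC"))` before storing is one way to keep database values unambiguous.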

    Niverhawk
    @Niverhawk_twitter
    Hello, I am trying to use dateparser, but whatever I do I get the following error: AttributeError: 'module' object has no attribute 'parse'. I used import dateparser and dateparser.parse('02-02-2020').
    The problem solved itself. Sorry for the spam :(
    Dylan Smith
    @Chemsmith
    o/ anyone around? I have an issue with dateparser returning None for most valid inputs