These are chat archives for FreeCodeCamp/DataScience

26th
Aug 2016
evaristoc
@evaristoc
Aug 26 2016 09:40

People

Automatic Summarization Project for Chatrooms and Forum

https://en.wikipedia.org/wiki/Automatic_summarization

I think I will continue working on this topic. In general there are few good APIs for that, but I wonder if we can develop something that suits the FCC case?

You can take, for example, a URL of a discussion in the Forum (NOTE: it will be truncated!) and use the following API:
https://www.tools4noobs.com/summarize/

I also tried some clustered texts from a previous project that involved communications in this room. You can pick a group of posts at the end of the following notebook:
https://github.com/evaristoc/fccgitterDataScience/blob/master/Identifying%20Relevant%20Topics%20in%20a%20Chatroom.ipynb

And use the aforementioned API to see how the API summarizes the content of the posts.

The Automatic Summarization project is currently experimental. The actual added value, and therefore the opportunities for and adoption of this kind of approach, are still to be seen. But FCC could, for example, present summaries of the weekly content of the main chats/forums in a more comprehensive way than just listing popular questions, or add summary capabilities so a person can get a general idea of the main arguments in an ongoing discussion.

My assumption is that this could improve the readability of the communications, make it easier to search for topics of interest, and increase conversion and engagement. However, I need to investigate whether those benefits have held in other situations where a summarizing tool has been used.
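
To make the idea concrete, here is a minimal sketch of one classic extractive approach: score each sentence by the average frequency of its words and keep the top few in their original order. This is not how the tools4noobs API works (I don't know its internals); the function names, the tiny stop-word list and the naive sentence splitter are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real one would be much longer).
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "that",
    "this", "for", "on", "with", "as", "are", "be", "i", "you", "we",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase words, with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole text.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

Something like `summarize(posts_as_one_string, max_sentences=3)` would be the way to try it on a group of posts. Chat dialogue is much noisier than an article, so a frequency-based scorer like this is only a baseline to compare the APIs against.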

Philip Durbin
@pdurbin
Aug 26 2016 11:58
@CodeNonprofit oh, free software rather than open source. Ok.
evaristoc
@evaristoc
Aug 26 2016 13:44

A presentation about Search Mechanisms and Search Analytics:

One of my favourite presentations so far (I was there; the guy who presented this was exceptional). For this one you need to have an idea of how Elasticsearch works, what Lucene is, what TF-IDF is and how words are ranked. Also, the case study is booking.com. Ask questions if interested:
http://www.slideshare.net/PetervanderWeerd1/you-know-for-search
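
For anyone who has not met TF-IDF before, here is a small sketch of the textbook version: a term counts more the more often it appears in a document (TF) and the fewer documents contain it (IDF). This is not Lucene's or Elasticsearch's exact scoring formula, which adds further normalisation; the function name, the whitespace tokenizer and the toy documents are assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Whitespace tokenizer (an assumption; real analyzers do much more)."""
    return text.lower().split()


def tf_idf_rank(query, docs):
    """Return document indices ordered from most to least relevant to the query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for idx, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term in tf:
                # The +1 keeps terms that occur in every document from scoring zero.
                idf = math.log(n_docs / df[term]) + 1.0
                score += (tf[term] / len(tokens)) * idf
        scores.append((score, idx))

    # Highest score first; ties broken by original position.
    return [idx for _, idx in sorted(scores, key=lambda pair: (-pair[0], pair[1]))]


# Invented toy documents, only to show the ranking.
docs = ["cheap hotel in amsterdam", "hotel review with photos", "flights to paris"]
print(tf_idf_rank("amsterdam hotel", docs))
```

The rare term ("amsterdam") pulls the first document to the top, while the common term ("hotel") contributes less to both documents that contain it.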
evaristoc
@evaristoc
Aug 26 2016 14:01
Recalling the presentation above by re-reading the slides... Wow... What an interesting job they are doing there...!
Below, another presentation, this time about how the future of text summarization was conceived in 2004 (we just have to realise that the techniques have been there for a while, just evolving...)
http://web.ipb.ac.id/~ir-lab/pdf/saggion04%20%28text%20summarization%29.pdf
Michael D. Johnson
@CodeNonprofit
Aug 26 2016 14:22
@krisgesling Thank you for your help on this. You've made an excellent point about standing on their shoulders from prior surveys. As soon as they get back to you, please let me know. Hopefully they'll be timely - we need this survey to go out like...today. :D
CamperBot
@camperbot
Aug 26 2016 14:22
:cookie: 399 | @krisgesling |http://www.freecodecamp.com/krisgesling
codenonprofit sends brownie points to @krisgesling :sparkles: :thumbsup: :sparkles:
evaristoc
@evaristoc
Aug 26 2016 18:11

@CodeNonprofit
Some research, the questions asked, and results, for the benchmark:

From what I see, @CodeNonprofit, the challenge IMO is to identify those nonprofit organisations that:

  • rely on Open Source for some of their activities
  • have programmes and projects that are technology-based; I expect, though, that if the nonprofit/NGO has a technology-based programme, they will have the required staff and budget for it - so their needs may lie somewhere else!
  • find that the existing solutions offered by the main vendors are NOT what they are looking for, and actually require some kind of customised solution for their project/programme to stay ahead
  • for some reason cannot get commodity products for their most general needs

Some recommendations based on that information (off-focus, opportunities):

  • Apparently FCC should offer training about Cloud Solutions?

  • Another aspect to pay attention to is the gap in the available technology training for NGOs and nonprofits. This is something that FCC can exploit by offering training for NGOs + tech support. The idea would be to bring clients and (future) service providers to the same marketplace and grow together.

    • I can guarantee that, given the nature of FCC, there are a lot of campers with a nonprofit orientation.
  • Interestingly, one issue is security and privacy. I would suggest checking how Tactical Tech is approaching this issue (mostly through activism).

  • Note that there is an opportunity for the FCC project to explore an approach that many nonprofits understand better: making courses that look more socially oriented than technically oriented.

  • I would also suggest developing a parallel project to offer FRONT-END services en masse instead of per project, like an AGENCY: front-end needs are too common to be dismissed, the market really exists and FCC students are more into it...

  • In the customisation, think about how to STANDARDISE. Modularisation could be a good idea. FCC can offer modular, even turn-key services to organisations instead of only customised Professional Services. Good market research is required...

evaristoc
@evaristoc
Aug 26 2016 19:27

@CodeNonprofit

In my opinion some of your previous questions might already be partially "answered":

  • do they work for a registered 501(c)(3) charitable organization?
  • does their organization have a budget specifically for technology? (Many of them don't)
  • what types of software do they spend money on? (Usual needs in general are admin and CRM, is this your question, @CodeNonprofit?)
  • is there an existing software solution for their use case?
  • do they make use of free solutions? (Currently many vendors are offering their services for free, so careful with wording...)
  • who in the organization is deciding what software to use? (Not sure about this question...)
  • how long have they used their current solution? (Why the question? What do you want to know?)
  • how satisfied are they with their current software solution? (Depends who you ask, you will have very different answers... not sure about this one...)
  • how can we best spend our volunteer efforts on nonprofits? (What do you have to offer?)

Why we want to know:

  • We’re curious where nonprofits are overpaying for technology the most (Depends... actually they are barely paying for it, that's the problem!)
  • We want to allocate our scarce resources (volunteers) carefully (:+1:)
  • We want nonprofits to understand, through our open data results, how other organizations are using technology (Better case studies?)
  • We want nonprofits to learn about off the shelf solutions, but also that custom code is a thing (:+1:)

I think that the survey could make a better impact if:

  • It supports what is already known (so not re-inventing the wheel...)
  • It emphasises topics like the use of Non-Vendor Free Software customised options in the market (your last point: off-the-shelf):
    • Which organisations are more prone to use them?
    • Why? What are their focuses? When is an off-the-shelf solution more convenient? When is it NOT the preferred option?
    • What makes those organisations different from those that don't use off-the-shelf solutions? Are they innovative? Do they have specialised staff and tech projects? Are they younger than their counterparts? Do they have no budget for vendor options? Are they activists in favour of Open Source?
    • What are their needs compared to other organisations?
    • Which organisations know about those opportunities and how many are prepared to use them?
    • Does the organisation think that its current solution could be replaced by a vendor solution at some point instead? Why?
    • If you use a VENDOR solution, would you replace it with an OFF-THE-SHELF one if it offers the same results? What would you expect to get in exchange for that change?
    • What are the main needs of non-vendor users?
  • The questionnaire could be split in TWO based on screening: one for those organisations with only Vendor-based technology needs (shorter) and another for organisations that use or would like to use Non-vendor-based ones. HOWEVER take into consideration that FRONT-END is largely a sort of NO-VENDOR, so... how to discriminate? You can ask if one of their priorities is Front-End and then what other tech priorities they have...
  • I don't know if the survey will reach the representative sample required for a statistical generalisation. No opinion about this... Also, I imagine those organisations are hard-to-reach ones, not sure about it...

Hope this helps...

evaristoc
@evaristoc
Aug 26 2016 20:58
Hard to find a good text summarization API that works for summarising the test dialogues I built using my application... I tested several of them...
https://www.quora.com/Natural-Language-Processing-What-are-the-best-realtime-text-summarization-APIs-Services
Kris Gesling
@krisgesling
Aug 26 2016 23:30
@CodeNonprofit classic FCC quick turnaround :) They got back to me and aren't able to release the questions publicly
Lightwaves
@Lightwaves
Aug 26 2016 23:37

@evaristoc "who in the organization is deciding what software to use"
This is a good question. A related question, if it hasn't been asked:

Who in the organization implements or distributes the software?

When I was interning, the organization I was working with originally had their marketing/visual design guy as the IT person for quite a while before hiring a dedicated IT person

Ankit
@bugwheels94
Aug 26 2016 23:51
@ankit31894
I would love to see statistics on the freecodecamp curriculum. How many people joined? How many challenges did they do? How much time did they take? blah blah...
Does anyone know of something like this?