So I created a new feature. Actually two new features, but one of them serves only to create the other. It's again about the family status of a person, but this time about the number of family members who travel with that person.
It's the “Number of Family Members”. To get there I needed to create another feature, the Family Name or Lastname. Yes, there is a name column in the dataset, but it lists the whole name. Here I used the split function again to extract all family names. With this information, I was able to count the family names and create this new feature.
I am curious how the model will react.
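A minimal sketch of how such a family-count feature could be built. It assumes the name column holds values shaped like "Lastname, Title Firstname"; the sample names and the helper `family_name` are hypothetical and not taken from the dataset.

```python
from collections import Counter

# Hypothetical sample rows; the "Lastname, Title Firstname" format is an assumption.
names = [
    "Smith, Mr. John",
    "Smith, Mrs. Anna",
    "Jones, Miss. Ella",
    "Smith, Master. Tom",
    "Jones, Mr. Paul",
    "Brown, Mr. Carl",
]


def family_name(full_name):
    """Return the text before the first comma, stripped of whitespace."""
    return full_name.split(",")[0].strip()


# Intermediate feature: the family name of every row.
last_names = [family_name(n) for n in names]

# Count how often each family name occurs in the data.
family_counts = Counter(last_names)

# Final feature: for each row, the number of rows sharing its family name
# (the person themselves is included in the count).
family_size = [family_counts[ln] for ln in last_names]

for name, size in zip(names, family_size):
    print(f"{name!r}: {size}")
```

One caveat with this approach: unrelated people who happen to share a last name are counted as one family, so the feature is only an approximation.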
Got these in the mail today. What a thoughtful gift from @chrimeo (subscription to @therevjournal)!!!
F1 is easily my favorite sport... as a goal for 2019, I want to take at least 2 weekends off to go watch #F1 races with her in person (she’s a fan too).
It may seem strange to make a goal to work less, but sometimes it’s important to stop and do things that we love besides working (which I love as well).
In addition to the racing, it’s interesting as a #datascientist to see the massive amount of #bigdata that teams crunch in real-time (telemetry!) during racing and to develop the cars.
Awesome intersection between #sports and #technology - if you’re not a fan you should check it out.
And btw, @mclaren is hiring #datascience roles in London right now if you’re interested!
Chipping away at those New Year's resolutions one day at a time. That’s all you can really ask for! It’s super easy to get overwhelmed by the sheer amount of stuff there is to learn, especially if you’re in an ever-evolving tech field.
The hardest part for any beginner (or anyone with goals, really) is figuring out where to start. For example, data science: where the heck do you even begin?! Sometimes it can feel like an insurmountable goal to be a data scientist.
The most common advice you hear is to just START. I know you’ve heard that before, and it’s not really that helpful. My advice is this: think about the areas of data science you are considering diving into and write out an actual list - whether it’s Python, Tableau, ML, AI, etc. - and then figure out which one pulls you the most. Which topic stands out to you more than the rest as an area of interest? Run with that gut feeling and just GO WITH IT! Start there and chip away at it every day. Repeat this process item by item down your list, and good luck! Don’t forget to reach out for help. We all need help every now and then (/always 😂)💜
Blockchain will reveal new opportunities in different industries
Everyone is now talking about blockchain, a revolutionary decentralized technology that stores and exchanges data for cryptocurrencies. It forms a distributed database with a digital register of the transactions and contracts. Blockchain stores an ever-growing list of ordered records called blocks, each containing a timestamp and a link to the previous block. Blockchain has impressive prospects in the field of digital transactions which will open new business opportunities in 2018.
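A toy sketch of the block structure described above: each block carries a timestamp and a link to the previous block. Using a SHA-256 hash as that link is an assumption made for illustration, and the helper names (`block_hash`, `make_block`, `build_chain`, `is_valid`) are hypothetical. Real blockchains add consensus, signatures, and networking on top.

```python
import hashlib
import json
import time


def block_hash(block):
    """Hash a block's full contents (timestamp, data, and link) with SHA-256."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_block(data, prev_hash):
    """Create a block holding a timestamp, a record, and a link to its predecessor."""
    return {"timestamp": time.time(), "data": data, "prev_hash": prev_hash}


def build_chain(records):
    """Build an ordered list of blocks, each linked to the one before it."""
    chain = []
    prev = "0" * 64  # placeholder link for the very first block
    for record in records:
        block = make_block(record, prev)
        chain.append(block)
        prev = block_hash(block)
    return chain


def is_valid(chain):
    """Check that every block's link matches the hash of the previous block."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True


chain = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
print(is_valid(chain))  # the untouched chain passes the check

chain[1]["data"] = "B pays C 200"  # tamper with an earlier record
print(is_valid(chain))  # the link stored in the next block no longer matches
```

Because each link depends on the full contents of the previous block, changing an old record breaks every link after it, which is what makes forged transactions easy to detect.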
This technology also opens up new possibilities, with applications in many other fields. Due to the growing role of social responsibility and security on the internet, blockchain technologies are becoming increasingly relevant. In a system using blockchain, it is nearly impossible to forge digital transactions, so the credibility of such systems will surely strengthen. This approach can become fundamental for disruptive digital business in enterprises and startups. Companies that previously operated offline will be able to move their processes into the digital environment completely.
Businesses need to account for blockchain risks and opportunities and analyze how this technology can influence customer behavior. As the initial hype around blockchain in the financial services industry slows down, we will see many more potential use cases for government, healthcare, manufacturing, and other industries. For example, blockchain strongly influences intellectual property management and offers new ways to protect against copyright infringement. #datascientist #pythonprogramming #embeddedsystems #learneveryday #arduino #kaggle #edx #github #webdesign #lautechblog #blogger #lifeofablogger #bigbangtheory #google #googlers #microsoft #artificialintelligence
Sallie Krawcheck, CEO & founder of @ellevest, is the third woman we're honoring as part of our “7 to Inspire” series before the @synapseflorida Summit. We're so impressed with the mission of her company, which uses an algorithm tailored specifically to women’s incomes and life cycles to help them reach their financial goals. How cool is that? #synapsesummit
There is no excuse to stop learning.
A conversation today with Mas Bobby: "Wow, Mas, it's the weekend and you're still this enthusiastic about running the GNUR community." "Yes, it's for sharing with friends, so that a data technology ecosystem takes shape quickly in Indonesia. So that we don't fall too far behind other countries. We can still catch up and learn, as long as we are willing. Age is no obstacle. So when we're busy working on weekdays, at least we can use the weekend to contribute a little to society through this community."
Thank you, GNU-R, for bringing me together with smart people who love their country :")
📣According to the latest LinkedIn news, the Data Scientist position tops the list of the most promising jobs of 2019 in the USA:
💸Average salary: $130,000/year
✔️Open vacancies: 4,000+
💡TOP skills: Data Science, Data Mining, Data Analysis, Python, Machine Learning
Good news for those who have already tied their lives to big data🤩🎓
So I read that kNN needs input features that are predictive. Other machine learning models can ignore non-predictive features, but kNN can't ignore a feature, because every feature goes into the distance calculation. Therefore I pruned some more features out of my model. And it seems to be working. But what about the fact that kNN needs input features that are predictive? If I create a new feature that is predictive, I could increase my model’s accuracy rate ... mmmmhhh 🤔 I need to think about that.
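A toy illustration of why kNN cannot ignore a feature: every column contributes to the distance, so a non-predictive column can change which neighbours are nearest. The data and the helper `knn_predict` are made up for this sketch and do not come from the model in the post.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict by majority vote among the k nearest points (Euclidean distance)."""
    dists = sorted((math.dist(row, x), label) for row, label in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Column 0 is predictive (class 0 sits near 0, class 1 sits near 10).
# Column 1 is noise that happens to take large values.
X_both = [[0, 90], [1, 80], [2, 95], [9, 50], [10, 48], [11, 55]]
y = [0, 0, 0, 1, 1, 1]

# Same data with the noise column pruned away.
X_pruned = [[row[0]] for row in X_both]

query_both = [1, 50]   # predictive value 1 clearly belongs to class 0
query_pruned = [1]

pred_noisy = knn_predict(X_both, y, query_both)
pred_pruned = knn_predict(X_pruned, y, query_pruned)

print("with noise feature:", pred_noisy)
print("noise feature pruned:", pred_pruned)
```

In this example the noise column dominates the distance, so the query is pulled towards the wrong class until that column is removed. Scaling the features helps with the magnitude problem, but a feature with no relation to the target still only adds noise to the distances.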
Learn DATA SCIENCE in a few EASY STEPS.
Do you believe that DATA SCIENCE is just the combination of COMPUTER SCIENCE and DATA MINING?
Nov 19, 2018
[7/100] Woke up at 6AM, went to the gym, ate breakfast, AND made my way to the office. The week's off to a great start! Tonight's coding course sesh will be focused on pandas, one of my all-time favorite Python libraries (tbt to when I wrote a whole tutorial on working with strings in pandas, maybe I should finally publish that 🤔). What are your favorite libraries or frameworks? #100DaysOfData 🐼
5:06 PM Jan 7, 2019
[16/100] I’m back! I knew keeping up with #100DaysOfData would be a real challenge, and that I’d probably fail a few times. But I’m following @multimicah’s concept of “striving for failure” - taking notes and moving the ball forward, as planned (his explanation is better than mine... so def follow him + subscribe to his newsletter if you’re intrigued by this topic).
So, a quick update: I skipped these last two days of course work, but today I’ll be picking back up on the second section of “Working with Data” lectures. Also, the random window of hexbins on my screen in this pic is unrelated - just me editing GeoJSON files 😂
Hope you’re having a great week! ✌️
[12/100] Remember that livestream I brought up a few days ago? Well, exciting news: 1. It’s happening!!!! 2. @_init_nat is joining me!!!!! 💯🎉🤩 Natalie is a data scientist and amazing human, so if you haven’t checked out her account yet, that’s step one. Step two: we’re rounding up our most frequently asked data science-y questions to answer during the livestream – so if you have a question, please feel free to DM/leave it in the comments!!! Step three: TUNE IN at 12PM EST tomorrow (1/13) – and we’ll see you there! 🤗 #100DaysOfData
5:53 PM Jan 12, 2019
[10/100] From looking at my posts, you might wonder if I ever leave my apartment, and the truth is... I don’t 😂
Except when I do, of course 🤷🏻♀️ Today has been full of running around and being busy with work/life in the best possible way. That’s why this particular #100DaysOfData post is coming to you late and without much fanfare... But I’m looking forward to getting back to this couch, curling up with my laptop and digging into some Python data cleaning lectures 📊
Jan 11, 2019
Simple linear regression is a statistical method that allows us to summarize and study relationships between two continuous (quantitative) variables. Simple linear regression gets its adjective “simple” because it concerns the study of only one predictor variable.
Shout out to @avik._.jain for this infographic.
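A minimal sketch of simple linear regression with one predictor, assuming the line is fitted by ordinary least squares (the post does not name a fitting method). The function name and the sample data are hypothetical.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lies exactly on the line y = 2x + 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_simple_linear_regression(x, y)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

The slope summarizes how much the response changes per unit of the single predictor, which is the "relationship between two continuous variables" the post refers to.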