Building datasets
Hello!
Thank you for this open source package; it helps a lot and your work is amazing.
I just have a silly question about dataset construction. I followed the example for my data: user (160,000 x 300) and item (4,000 x 4).
dataset = Dataset()
dataset.fit(users=(x['id_user'] for x in user),
            items=(x['id_item'] for x in item),
            user_features=((x['id_user'], [[x[col] for col in list_columns_user]]) for x in user),
            item_features=((x['id_item'], [[x[col] for col in list_columns_item]]) for x in item))
But when I call dataset.user_features_shape(), I get (160000, 160000). Shouldn't I get (160000, 300) instead?
Indeed, the documentation says:
Returns
-------
(num user ids, num user features): tuple of ints
and my number of user features is 300. So is there an error in what I did?
Sorry for the stupid question!
Issue Analytics
- State:
- Created 5 years ago
- Comments:16
Top GitHub Comments
You need to pass an iterable of tuples of (id, [list of features for that id]) into build_features. It looks like at the moment you're passing the same features for every user?
(Well, you should get 160000 x 1600300 or something like that. Are your feature names the same as some of your user ids?)
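To make the comment above concrete, here is a minimal sketch of the two inputs involved. It assumes the 300 user columns are numeric and that each column name is used as a feature name; the toy rows and the column names `age` and `premium` are hypothetical stand-ins for the real table. It also assumes LightFM's `Dataset` keeps its default identity features, in which case every user id counts as a feature of its own, so the second dimension would be the number of users plus the number of columns rather than the number of columns alone. The LightFM calls only run if the package is installed.

```python
# Toy stand-in for the real user table (hypothetical rows and column names).
users = [
    {"id_user": "u1", "age": 0.3, "premium": 1.0},
    {"id_user": "u2", "age": 0.7, "premium": 2.0},
    {"id_user": "u3", "age": 0.5, "premium": 1.0},
]
list_columns_user = ["age", "premium"]

# fit() needs the feature *names*: one entry per column, not one per row.
user_feature_names = list(list_columns_user)

# build_user_features() needs one (user id, features) tuple per user.
# A dict of {feature name: weight} carries the numeric value of each column.
user_feature_rows = [
    (row["id_user"], {col: row[col] for col in list_columns_user})
    for row in users
]

n_users = len(users)
n_features = len(user_feature_names)

# Assumption: with identity features on, each user id is also a feature.
expected_shape_default = (n_users, n_users + n_features)
# With identity features off, only the declared columns remain.
expected_shape_no_identity = (n_users, n_features)

try:
    from lightfm.data import Dataset
except ImportError:
    Dataset = None  # the sketch above still runs without LightFM

if Dataset is not None:
    dataset = Dataset(user_identity_features=False)
    dataset.fit(
        users=(row["id_user"] for row in users),
        items=["i1"],  # placeholder item id, only here to satisfy fit()
        user_features=user_feature_names,
    )
    shape = dataset.user_features_shape()
    user_features = dataset.build_user_features(user_feature_rows)
```

With the real data, the same pattern would give 300 feature columns when identity features are switched off, and 160000 + 300 when they are left on.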