Data type |
Time series
Table
|
Variable (parameter) names |
review_id|longitude|latitude|altitude|review_date|temperature|rating|user_id|user_birthday|user_nationality|category user_career|double user_income |
Overview of the data |
The dataset is about travelers' personal information. It has 2,000,000 instances and 12 attributes. The data were collected to find out which factors affect how travelers rate a POI (point of interest). |
Intended data analysis / simulation process |
The inputs are attributes No. 2 through No. 12 (the first attribute is an ID and should not be used for training). The output is the set of factors most likely to be related to rating.
A regression method can then be used to determine the specific relationship between each factor and rating.
In addition, some user_income values are missing. The career and nationality information can be used to predict and fill in the missing values. |
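The two steps described above (filling in missing user_income, then regressing rating on candidate factors) could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the sample rows are made up, the helper names `impute_income`, `simple_ols`, and `rank_factors` are hypothetical, the group median is just one possible way to fill the gaps, and a one-variable least-squares fit per factor stands in for whatever regression method is finally chosen.

```python
from statistics import mean, median

# Made-up sample rows for illustration only; not taken from the real dataset.
rows = [
    {"temperature": 10.0, "user_career": "engineer", "user_nationality": "JP", "user_income": 40.0, "rating": 2.0},
    {"temperature": 20.0, "user_career": "engineer", "user_nationality": "JP", "user_income": 60.0, "rating": 4.0},
    {"temperature": 30.0, "user_career": "engineer", "user_nationality": "JP", "user_income": None, "rating": 5.0},
    {"temperature": 15.0, "user_career": "student", "user_nationality": "FR", "user_income": 30.0, "rating": 3.0},
    {"temperature": 25.0, "user_career": "teacher", "user_nationality": "DE", "user_income": None, "rating": 4.0},
]


def impute_income(rows):
    """Fill missing user_income with the median income of users who share
    the same career and nationality; fall back to the overall median."""
    groups = {}
    for r in rows:
        if r["user_income"] is not None:
            key = (r["user_career"], r["user_nationality"])
            groups.setdefault(key, []).append(r["user_income"])
    overall = median([v for vals in groups.values() for v in vals])
    filled = []
    for r in rows:
        r = dict(r)  # copy so the input rows are left unchanged
        if r["user_income"] is None:
            key = (r["user_career"], r["user_nationality"])
            r["user_income"] = median(groups[key]) if key in groups else overall
        filled.append(r)
    return filled


def simple_ols(xs, ys):
    """Least-squares line y = slope * x + intercept; returns (slope, intercept, r_squared)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        # A constant column carries no information about the target.
        return 0.0, my, 0.0
    slope = sxy / sxx
    return slope, my - slope * mx, (sxy * sxy) / (sxx * syy)


def rank_factors(rows, features, target="rating"):
    """Fit one regression per numeric feature and sort by R^2, highest first."""
    ys = [r[target] for r in rows]
    fits = {f: simple_ols([r[f] for r in rows], ys) for f in features}
    return sorted(fits.items(), key=lambda kv: kv[1][2], reverse=True)


filled = impute_income(rows)
ranked = rank_factors(filled, ["temperature", "user_income"])
for name, (slope, intercept, r2) in ranked:
    print(f"{name}: slope={slope:.3f}, R^2={r2:.3f}")
```

Fitting each factor separately keeps the sketch short, but it ignores interactions between factors. A multivariate regression over all attributes, with categorical ones such as user_career encoded first, would likely be closer to the intended analysis.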
Results of the intended data analysis / simulation process (analysis results / tool output / typical examples, etc.) |
Using this dataset, I can identify the important factors that affect rating and the relationships among these factors. |
Analyses expected beyond the analysis / simulation process above |
Use more suitable regression methods. |