11,445
questions
0
votes
0
answers
9
views
Trouble loading the celebA dataset for vgg16 model
I'm trying to use the CelebA dataset for a VGG16 model; however, when I try to run this code
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow....
-1
votes
0
answers
10
views
Where can I find a medical dataset of X-ray images for AI-based bone fracture detection? [closed]
I am working on an AI project focused on bone fracture detection to enhance elderly safety. For this, I need a publicly available dataset that includes X-ray images of bones (with fractures and ...
0
votes
0
answers
28
views
Struggling to Fine-Tune LLaMA 3.2 Models: Why Does Base Model Outperform Instruct in My Use Case?
I’ve been trying to fine-tune the LLaMA 3.2-Instruct model on my custom dataset, which is in JSON-style chat format. The dataset is small (around 400 entries) and I can’t share it due to its ...
-2
votes
0
answers
15
views
Creating a representative subset for a blockchain anomaly detection task
We have to create a cloud solution in which we gather and transform blockchain transaction data from three networks (Solana, Bitcoin, Ethereum) and then use machine learning methods for anomaly ...
1
vote
0
answers
34
views
Performing Correspondence Analysis in R
I am experimenting with Correspondence Analysis in R on the Airline Passenger Satisfaction dataset and I encountered this error:
The value of the argument axes is incorrect. ", (tp.R#82): argument ...
-3
votes
0
answers
4
views
Looking for dataset mqic-patient-data-100k-sample [closed]
I am looking for a dataset that was previously available at the link below. Can anyone help?
www.visualizing.org/mqic-patient-data-100k-sample
0
votes
0
answers
31
views
Trouble loading dataset on Kaggle -- OSError: [Errno 28] No space left on device
I've been trying to load a dataset that consists of a 1.46 GB .tgz file. This file contains ~2 million .pt files that I'm trying to use to train a neural network. I've been trying to load this dataset ...
1
vote
0
answers
12
views
XMorpher model for DMIR - dataset problem
I'm trying to run the GitHub project XMorpher (https://github.com/Solemoon/XMorpher/tree/main), but the author did not provide the right dataset. In the code he used:
train_labeled_unlabeled_dir = 'data/...
1
vote
2
answers
43
views
R - add column to dataset with number of times that a row value is repeated [duplicate]
I've been searching for how to do this in R, but unfortunately haven't found an easy way to do it.
If I have a dataset called people like this:
| A | B |
|---|---|
| John | Student |
| John | Student |
| John | Student |
| Sarah | Student |
...
2
votes
1
answer
26
views
How to convert character indices to BERT token indices
I am working with a question-answer dataset UCLNLP/adversarial_qa.
from datasets import load_dataset
ds = load_dataset("UCLNLP/adversarial_qa", "adversarialQA")
How do I map ...
0
votes
1
answer
29
views
Can't iterate over dataset (AttributeError: module 'numpy' has no attribute 'complex'.)
I'm using:
windows
python version 3.10.0
datasets==2.21.0
numpy==1.24.4
I tried to iterate over a dataset I just downloaded:
from datasets import load_dataset
dataset = load_dataset("jacktol/atc-...
0
votes
0
answers
11
views
I am trying to run my code on the PKU-MMD dataset, but it gives me this error
File "/public/usman/SCD-NET/feeder/augmentations.py", line 135, in temporal_cropresize
temporal_context=temporal_context.permute(0, 2, 3, 1).contiguous().view(C * V * M,temporal_crop_length)
...
0
votes
0
answers
17
views
Problem when Training LLM, shape of 3D attn_mask is wrong
I am currently trying to train an LLM using the PyTorch library, but I have an issue that I cannot solve. I don't know how to fix this error. Maybe someone can help me. In the post I will include a ...
1
vote
0
answers
10
views
How to reduce the dataset size with Power BI date slicers for a Vega-Lite visual?
I am new to Vega-Lite.
Potentially I have over one million rows in a Power BI table. Each row has an associated timestamp. I want to reduce the dataset size with the start and end date slicer in ...
0
votes
1
answer
22
views
Avoid reloading PyTorch datasets
I train CNNs on a relatively stable combination of datasets, but every time I start a training job, there's a 5-10 min wait for the trainer to load my dataframes from disk. Is it possible to avoid ...
1
vote
1
answer
19
views
Calculate accuracy, recall, precision, and balanced accuracy from a confusion matrix
The confusion matrix shows how the actual labels compare with the predicted labels for a binary classification problem.
Using the confusion matrix, calculate the following:
Accuracy: What proportion ...
0
votes
2
answers
40
views
Azure Machine Learning studio gives an error: File https://aka.ms/bike-rentals/MLTable is empty
I am following these instructions (https://microsoftlearning.github.io/mslearn-ai-fundamentals/Instructions/Labs/01-machine-learning.html) to train a model in Azure Machine Learning studio. I followed ...
1
vote
0
answers
12
views
[Dataset Discovery::Table Search] Can HNSW and other indexes be applied to Table Search?
I have previously learned about many efficient indexing structures in the field of vector retrieval, such as HNSW, DiskANN, LSH, etc. These indexes can help us solve ANN (Approximate Nearest Neighbor search) ...
2
votes
1
answer
81
views
Save to disk training dataset and validation dataset separately in PyTorch
I want to save the training, test, and validation datasets in 3 separate folders.
Doing this for training and testing is easy:
# Get training and testing data
all_training_data = getattr(...
0
votes
0
answers
11
views
Deep learning regression models with datasets that have features and labels on different scales and distributions
I am creating a deep learning model that combines many different datasets that have different biases or measurement errors and sometimes distributions within both their features and labels.
For the ...
1
vote
1
answer
113
views
Unable to obtain link to download the ADE20k-full dataset
I was trying to register here (https://groups.csail.mit.edu/vision/datasets/ADE20K/) to get the link to download the ADE20k-full dataset but I keep getting the following error: "Fatal error ...
1
vote
1
answer
55
views
Chart.js - Alignment of 0 values for multiple datasets
I am trying to compare 2 datasets in Chart.js. There are large differences in values, so in order to compare the values I have decided to graph the data on 2 different y axes.
One of the datasets has ...
1
vote
1
answer
61
views
Dataset scaling in a Chart.js bar chart for better comparison
I’m using Chart.js to graph two datasets on the same bar chart for a time-based comparison (x-axis). However, I’m encountering an issue where the height of one graph squashes or expands to fit the ...
0
votes
1
answer
29
views
How to index over Datasets in Xarray
I have an array that looks like this:
<xarray.Dataset>
Dimensions: (x: 1536, y: 1440)
Coordinates:
lon (x, y) float64 -8.387 -8.372 -8.358 ... 16.65 16.67
...
0
votes
0
answers
38
views
How do I access batched image filenames while using 'image_dataset_from_directory'?
I have a database of close to 10 million images. I am using the classical definition below to build a dataset:
trainData = image_dataset_from_directory(
...
0
votes
0
answers
30
views
Plotting missing values - python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pylab import rcParams
import missingno as msno
import warnings # Ignores any warning
rcParams['figure....
2
votes
2
answers
119
views
How to generate/find dataset of global cities in multiple languages?
I'm trying to create a global city dataset in multiple languages. I found a few approaches, but none seem reliable enough. Looking for free/cheap solutions. Below is an ideal example of CSV data I'd ...
0
votes
0
answers
24
views
I can't pass the automated test
def topN_pos(csv_file_path, N):
"""
Example:
>>> topN_pos('train.csv', 3)
output would look like [(noun1, 22), (noun2, 10), ...]
"""
# ...
0
votes
0
answers
12
views
Trying to plot two data sets pertaining to sea ice data on to one graph and one of the data sets won't plot
I am trying to plot two separate datasets pertaining to sea ice data onto one graph. The first data set concerns the years 2017 to 2021, and the second set looks at the average sea extent by day for ...
0
votes
0
answers
55
views
Create Kedro PartitionedDataset of PartitionedDatasets
I'm working on a Kedro project where I want to automatically label thousands of audio files, apply transformations to them and then store them in a folder of folders, each subfolder corresponding to ...
0
votes
0
answers
26
views
VQA v2.0 Dataset: more than 40000 images are missing
I'm working on training a model on VQA v2.0 dataset. The dataset can be found at https://visualqa.org/download.html
I installed the zip files containing image data (all splits) under the section ...
0
votes
1
answer
53
views
Trouble creating a dataset from h5 file
I'm having a problem regarding the h5 file. Whenever I try to load it I get this error:
OSError: Can't read data (can't open directory: /opt/conda/envs/AE4353/lib/hdf5/plugin)
I can read the file ...
1
vote
1
answer
60
views
How to make a BigQuery data backup minimizing financial costs?
I'm helping to close an organization that is going out of business. One of the tasks is to make a backup of all our datasets in BigQuery for at least 5 years for legal purposes. Since this is quite a ...
-1
votes
1
answer
58
views
SQL DataAdapter Removes # Character and Concatenates Values (e.g., 10#1000 becomes 101000) When Filling DataSet
I'm using a stored procedure to fetch data in a C# application. The SQL query returns the correct values when executed directly in SQL Server, but when I use a SqlDataAdapter to fill a DataSet, the ...
3
votes
1
answer
119
views
PyTorch: difference between reshape() and view() method
What is the difference between the reshape and view methods, and why do we need them? I am using PyTorch tensors and, while working on changing the shape of data, I came across these two functions. What are the ...
0
votes
2
answers
43
views
ADF-Tick or Untick the First Row as header in a Dynamic Dataset based upon the table
We use a generic dataset to move two CSV files: the external control file and the actual data file (ctl & Dat).
In the Sink we use a generic dataset as shown above
The ctl file, header, should not,...
0
votes
1
answer
47
views
Excluding records between dates AND with an associating value
I am trying to return a dataset whose records fall within a 30-day window and belong to another set of values.
Here is how the table looks:
| EMAIL | LIST | CREATED_DATE |
--------------------------...
1
vote
0
answers
27
views
dbMEM for community matrix with spatial and environment data [closed]
I have two community data sets, one for reef fish abundance per site and another for coral cover in the same site. I also have environment data with stony coral cover and rugosity index, and a matrix ...
-1
votes
1
answer
85
views
Getting an error on scheduled refresh of reports in Power BI
There are some reports in Power BI Report Server. Each of these reports has scheduled refresh times. However, some reports give a memory error during scheduled refresh. Delays occur in the work ...
0
votes
0
answers
37
views
Slow Data Loading and Low GPU Utilization in PyTorch Federated Learning with Frequent Client Switching
I'm working on a federated learning project using PyTorch, focusing on medical imaging (MRI) data. Despite using an SSD, the dataset loading phase is unusually slow, and the GPU utilization remains ...
0
votes
0
answers
30
views
Is there any open/legal collection of open profile Instagram reels, stories or similar short videos to train our models on?
I am developing a model for which I need a collection (100k+ at least) of reels and stories of the kind that are posted to Instagram. While there are scrapers and bots I was wondering if there is an ...
0
votes
3
answers
38
views
Expand the data set to include all combinations of the values of the variables using R
I have the following snippet of my data set that consists of thousands of observations
and I want to expand this data set using R so as to include all combinations of the first two columns
that is, I ...
0
votes
1
answer
42
views
Python script to read credentials from a yaml file, read a dataset and update information
I'm working on an academic project where I need to create a Python script to change credentials from a YAML file. This script should read the YAML file, then look for the value of Oauth 1 in a ...
1
vote
2
answers
54
views
Creating flag based on matching values in the variable across two datasets
I have two datasets, and I have to check whether the values match across the rows. If they match, the flag should return pass; otherwise, fail.
The datasets that I have:
Dataset 1 ...
0
votes
0
answers
107
views
Processing dataset for Llama 3.1 Instruct using PyTorch
Hey, I am fine-tuning Llama 3.1 Instruct using PyTorch. I was wondering how to correctly process the dataset. Can you tell me if I am using the prompt template correctly and can I pass the whole prompt ...
0
votes
0
answers
49
views
Understanding the `model.fit` function in keras and imbalanced datasets
As an exercise, I'm trying to translate a model written in Keras (https://github.com/CVxTz/ECG_Heartbeat_Classification/blob/master/code/baseline_mitbih.py) into PyTorch code. I realize in Keras much ...
0
votes
0
answers
42
views
How to create a dataset that combines 2 datasets and then create dataloader for pytorch?
I am trying to do semi-supervised learning.
I first train on a small training set (represented by training on the test set instead of the normal training set); then, during validation, I take the ...
0
votes
0
answers
19
views
Python - Dataset anonymization script by grouping variables
TL;DR: How can I code an aggregation, in Python, that guarantees it is impossible to identify an individual, while retaining as much data as possible and avoiding groups that are too large?
Example
Imagine a dataset of ...
0
votes
1
answer
49
views
Hitting a wall attempting C# method using OpenXML SDK to export all the DataTables in a DataSet to be sheets in a new Excel Workbook
Here is a partial method that takes a DataSet and creates a byte array that could be written to an xlsx file, exporting all the tables in that DataSet as sheets in an Excel workbook... EXCEPT that all ...
0
votes
1
answer
65
views
Props data type undefined in a local Vue 3 component loaded with an import tag; Vue 3 needed for a standalone page
The use-noms-cpnt component is as follows:
<div id="app">
<use-noms-cpnt></use-noms-cpnt>
</div>
I have an App component and an App mounted like this, I put a props ...