Uploading datasets

Hey all, 
I am working on dataset uploading and I stumbled upon something.

https://github.com/openml/openml-python/blob/f618d816d540b033368453a50a13b76423e43adc/openml/datasets/dataset.py#L373-L397

The `publish()` method of `OpenMLDataset` uploads a dataset to OpenML using the dataset's XML description and an ARFF file. However, as the class is implemented right now, **self.data_file** is a string containing the path to the dataset file.

In my opinion, we should have a function in the functions module of openml.datasets that takes the description and the ARFF file as arguments.

Something like:
`publish_dataset(description, file)`
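
To make the idea concrete, here is a rough sketch. Everything in it is hypothetical: `publish_dataset` does not exist in openml-python, the extra `api_call` parameter is only a stand-in for `_perform_api_call` so the snippet runs without a server, and the namespace URI in the fake response is an assumption on my part.

```python
import io
import xml.etree.ElementTree as ET

# Assumed namespace of the server reply; only used by this sketch.
OML_NS = "{http://openml.org/openml}"


def publish_dataset(description, file, api_call):
    """Hypothetical module-level upload helper (not part of openml-python).

    description : str, the dataset description as XML
    file        : path string or file-like object with the ARFF data, or None
    api_call    : callable standing in for _perform_api_call
    """
    file_elements = {'description': description}
    file_dictionary = {}
    if file is not None:
        file_dictionary['dataset'] = file

    return_value = api_call("/data/", file_dictionary=file_dictionary,
                            file_elements=file_elements)

    # Pull the new dataset id out of the XML reply.
    root = ET.fromstring(return_value)
    return int(root.find(OML_NS + "id").text)


def fake_api_call(call, file_dictionary=None, file_elements=None):
    # Stand-in for the real server call, used only to exercise the sketch.
    return ('<oml:upload_data_set xmlns:oml="http://openml.org/openml">'
            '<oml:id>42</oml:id></oml:upload_data_set>')


arff = io.StringIO("@relation demo\n@attribute x numeric\n@data\n1\n")
dataset_id = publish_dataset("<oml:data_set/>", arff, fake_api_call)
print(dataset_id)  # 42, the id baked into the fake reply
```

The point of the sketch is only the signature: the caller hands over the description and the data directly, instead of the function reading a path stored on an object.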

What is your opinion regarding this?

	def publish(self):
	    """Publish the dataset on the OpenML server.

	    Upload the dataset description and dataset content to openml.

	    Returns
	    -------
	    return_code : int
	        Return code from server

	    return_value : string
	        xml return from server
	    """

	    file_elements = {'description': self._to_xml()}
	    file_dictionary = {}

	    if self.data_file is not None:
	        file_dictionary['dataset'] = self.data_file

	    return_value = _perform_api_call("/data/", file_dictionary=file_dictionary,
	                                     file_elements=file_elements)

	    self.dataset_id = int(xmltodict.parse(return_value)['oml:upload_data_set']['oml:id'])
	    return self

Uploading datasets #442
