
@JonaLoeffler JonaLoeffler commented Apr 2, 2025

This PR contains a prototype for #891

I know this implementation might have a few issues; I just wanted to get the ball rolling and understand how one would even start working on this feature.

I am actually more interested in supporting Digikam's face tags, which are stored as Exif tags as well.

@JonaLoeffler
Author

Maybe this should be behind a settings flag to be disabled by default? I can imagine this will be surprising to many users if enabled by default, but the option will be appreciated by others.

@NJugel

NJugel commented Apr 4, 2025

> Maybe this should be behind a settings flag to be disabled by default? I can imagine this will be surprising to many users if enabled by default, but the option will be appreciated by others.

Wow, thanks for the first step. That looks very promising. Unfortunately, I'm not a developer in the community, but I'm happy to see the first real progress.
Maybe the flag can be set in the import popup.

@pktiuk
Contributor

pktiuk commented Apr 16, 2025

I think a mergeable and doable POC would be to display a simple list of tags in the image info panel.

For the sake of simplicity it would be read-only, but it would be a good start.

@bastiion

bastiion commented Aug 18, 2025

I've been working on extending this embedded tags feature and wanted to share some thoughts on the approach.

First off, thanks @JonaLoeffler for getting the ball rolling on this! Having support for embedded tags from EXIF metadata is a great addition to Memories, especially for users like me who tag their photos in other applications.

I've been thinking about how to handle these tags, particularly in shared environments where privacy becomes important. Instead of using the systemtags table, I've implemented a dedicated table specifically for embedded tags that preserves their hierarchical structure.

The approach I've taken is:

  1. Store tags with the owner's user ID in a dedicated memories_embedded_tags table
  2. Extract tags from various EXIF fields (TagsList, HierarchicalSubject, Keywords, Subject)
  3. Preserve the hierarchical structure of tags (like Animals/Mammals/Dogs)
  4. Ensure tags don't leak between users who shouldn't have access to them
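
To make step 3 more concrete, here is a minimal sketch of how a hierarchical tag such as Animals/Mammals/Dogs could be expanded into its ancestor paths before it is stored. The function name `expandTagPath` is hypothetical and not taken from the PR, and accepting both `/` and `|` as separators is an assumption (different tagging tools write different separators into the hierarchical fields).

```php
<?php
/**
 * Expand one hierarchical tag into all of its ancestor paths.
 * Hypothetical helper for illustration; not part of the PR.
 */
function expandTagPath(string $tag, string $separators = '/|'): array
{
    // Split on any of the separator characters.
    $parts = preg_split('/[' . preg_quote($separators, '/') . ']/', $tag);

    // Trim whitespace and drop empty segments (e.g. from "a//b" or "").
    $parts = array_values(array_filter(array_map('trim', $parts), fn ($p) => $p !== ''));

    // Build "A", "A/B", "A/B/C", always normalised to "/" as the stored separator.
    $paths = [];
    $current = '';
    foreach ($parts as $part) {
        $current = $current === '' ? $part : $current . '/' . $part;
        $paths[] = $current;
    }
    return $paths;
}
```

With this sketch, `expandTagPath('Animals/Mammals/Dogs')` yields `['Animals', 'Animals/Mammals', 'Animals/Mammals/Dogs']`, so a search for "Animals" could also match photos tagged only at the leaf level.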

I agree with @NJugel that this should probably be behind a settings flag initially. And @pktiuk's suggestion of starting with a simple read-only display in the image info panel is a great first step.

One thing I've been particularly concerned about is how to handle tags in shared photos. We need to be careful with sensitive tags (like face tags or classified information). I'm thinking of a layered approach:

  • By default, users see tags from their own photos
  • For shared content, we could optionally create a mapping that respects Nextcloud's sharing permissions
  • For small sets of photos, we could extract tags directly from EXIF in the browser as a fallback
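
The first two layers could be sketched roughly as follows. Everything here is an assumption for illustration: the row shape (`fileid`, `owner`, `tag`), the function name `visibleTags`, and the idea that the caller has already resolved which file IDs are shared with the viewer through Nextcloud's sharing mechanism.

```php
<?php
/**
 * Return the distinct tags a viewer may see.
 * Hypothetical row shape: ['fileid' => int, 'owner' => string, 'tag' => string].
 * $sharedFileIds is assumed to be resolved elsewhere from Nextcloud's shares.
 */
function visibleTags(array $rows, string $viewer, array $sharedFileIds = [], bool $includeShared = false): array
{
    $shared = array_flip($sharedFileIds);
    $visible = [];

    foreach ($rows as $row) {
        // Layer 1: tags on the viewer's own photos are always visible.
        $isOwn = $row['owner'] === $viewer;
        // Layer 2 (opt-in): tags on files that are shared with the viewer.
        $isShared = $includeShared && isset($shared[$row['fileid']]);

        if ($isOwn || $isShared) {
            $visible[$row['tag']] = true;
        }
    }

    // Array keys that look like integers come back as ints; cast back to strings.
    $tags = array_map('strval', array_keys($visible));
    sort($tags, SORT_STRING);
    return $tags;
}
```

Keeping the shared layer behind an explicit flag means the default behaviour never exposes another user's tags, which matches the "own photos by default" rule above.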

This gives us flexibility while maintaining privacy. The base implementation is straightforward, but it can be extended for more complex sharing scenarios.

I created a separate PR that implements parts of the approach I've described and the ideas mentioned in this thread.

@Offerel

Offerel commented Sep 8, 2025

Based on my plugin for Roundcube, I have created a small PHP file. Save it as phtags.php somewhere your webserver user can read and write data; it should not be accessible from the browser. Replace the values for the database name, user, password, and so on with your own.

<?php
$supported = array("jpg", "jpeg", "png", "webp", "gif", "tif");
$logfile = '/var/log/systemtags.log';
$db_server = 'servername or ip';
$db_name = 'databasename';
$db_user = 'databaseuser';
$db_pawd = 'databasepassword';
$lrunf = dirname(__FILE__).'/phtags_last.run';
$stime = time();



try {
	$db = new PDO("mysql:host=$db_server;dbname=$db_name;charset=utf8mb4", $db_user, $db_pawd);
	$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
	$db->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
} catch(PDOException $e) {
	logm("Connection failed: " . $e->getMessage(), 1);
	die();
}

$bdir = (isset($argv[1])) ? getBdir($argv[1]):1;
if($bdir === 1) die(logm("Wrong dir", 1));

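function getBdir($path) {
	global $db;

	if(!str_contains($path, 'files')) return 1;

	// Initialise both lookups so a missing row is detected instead of
	// triggering an undefined-variable warning further down.
	$storage_id = null;
	$id = null;

	$stmt = $db->prepare("SELECT `storage_id` FROM `oc_mounts` WHERE `mount_point` = ?");
	if ($stmt->execute(array($path))) {
		while ($row = $stmt->fetch()) {
			$storage_id = $row['storage_id'];
		}
	}
	if($storage_id === null) return 1;

	$stmt = $db->prepare("SELECT `id` FROM `oc_storages` WHERE `numeric_id` = ?");
	if ($stmt->execute(array($storage_id))) {
		while ($row = $stmt->fetch()) {
			$id = $row['id'];
		}
	}

	// Only local storages are supported; their id looks like "local::/path/to/data/".
	if($id === null || !str_starts_with($id, 'local::')) return 1;

	// Strip the "local::" prefix (7 characters) to get the directory on disk.
	return substr($id, 7);
}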

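// On the first run the timestamp file does not exist yet, so scan everything.
$lastrun = is_file($lrunf) ? (int) file_get_contents($lrunf) : 0;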
scanGallery($bdir);
file_put_contents($lrunf, $stime);

function scanGallery($dir) {
	global $supported, $lastrun;
	$images = array();
	foreach (new DirectoryIterator($dir) as $fileInfo) {
		if (!$fileInfo->isDot()) {
			if ($fileInfo->isDir()) {
				scanGallery($fileInfo->getPathname());
			} else {
				$filename = pathinfo($fileInfo->getFilename());
				if(isset($filename['extension']) && in_array(strtolower($filename['extension']), $supported)) {
					$images[] = $dir.'/'.$filename['basename'];
				}
			}
		}
	}

	if (count($images) > 0) {
		foreach ($images as $key => $image) {
			if(basename($image)[0] == ".") {
				unset($images[$key]);
				logm("Ignore hidden $image", 4);
				continue;
			}

			if($lastrun >= filemtime($image)) {
				unset($images[$key]);
				logm("Ignore already tagged $image", 4);
				continue;
			}
		}

		$chunks = array_chunk($images, 1000, true);
		if(count($chunks) > 0) {
			foreach ($chunks as $key => $chunk) {
				$imgarr = array();

				foreach ($chunk as $image) {
					logm("Get exif data for $image", 3);
					ini_set('exif.decode_unicode_motorola','UCS-2LE');
					$exif_data = @exif_read_data($image);
					$gis = getimagesize($image, $info);
					$exif_arr = array();
					$exif_arr['SourceFile'] = $image;

					if(is_array($exif_data)) {
						/*
						(isset($exif_data['Model'])) ? $exif_arr['Model'] = $exif_data['Model']:"";
						(isset($exif_data['FocalLength'])) ? $exif_arr['FocalLength'] = parse_fraction($exif_data['FocalLength']):"";
						(isset($exif_data['FNumber'])) ? $exif_arr['FNumber'] = parse_fraction($exif_data['FNumber'],2):"";
						(isset($exif_data['ISOSpeedRatings'])) ? $exif_arr['ISO'] = $exif_data['ISOSpeedRatings']:"";
						(isset($exif_data['DateTimeOriginal'])) ? $exif_arr['DateTimeOriginal'] = strtotime($exif_data['DateTimeOriginal']):filemtime($image);
						(isset($exif_data['ImageDescription'])) ? $exif_arr['ImageDescription'] = $exif_data['ImageDescription']:"";
						(isset($exif_data['Make'])) ? $exif_arr['Make'] = $exif_data['Make']:"";
						(isset($exif_data['Software'])) ? $exif_arr['Software'] = $exif_data['Software']:"";
						(isset($exif_data['Flash'])) ? $exif_arr['Flash'] = $exif_data['Flash']:"";
						(isset($exif_data['ExposureProgram'])) ? $exif_arr['ExposureProgram'] = $exif_data['ExposureProgram']:"";
						(isset($exif_data['MeteringMode'])) ? $exif_arr['MeteringMode'] = $exif_data['MeteringMode']:"";
						(isset($exif_data['WhiteBalance'])) ? $exif_arr['WhiteBalance'] = $exif_data['WhiteBalance']:"";
						(isset($exif_data["GPSLatitude"])) ? $exif_arr['GPSLatitude'] = gps($exif_data['GPSLatitude'],$exif_data['GPSLatitudeRef']):"";
						(isset($exif_data["GPSLongitude"])) ? $exif_arr['GPSLongitude'] = gps($exif_data['GPSLongitude'],$exif_data['GPSLongitudeRef']):"";
						(isset($exif_data['Orientation'])) ? $exif_arr['Orientation'] = $exif_data['Orientation']:"";
						(isset($exif_data['ExposureTime'])) ? $exif_arr['ExposureTime'] = $exif_data['ExposureTime']:"";
						(isset($exif_data['ShutterSpeedValue'])) ? $exif_arr['TargetExposureTime'] = shutter($exif_data['ShutterSpeedValue']):"";
						(isset($exif_data['UndefinedTag:0xA434'])) ? $exif_arr['LensID'] = $exif_data['UndefinedTag:0xA434']:"";
						(isset($exif_data['MimeType'])) ? $exif_arr['MIMEType'] = $exif_data['MimeType']:"";
						(isset($exif_data['DateTimeOriginal'])) ? $exif_arr['CreateDate'] = strtotime($exif_data['DateTimeOriginal']):"";
						*/
						$exif_arr['Keywords'] = (isset($info['APP13'])) ? iptc_keywords($info['APP13']):"";
						/*
						(isset($exif_data['Artist'])) ? $exif_arr['Artist'] = $exif_data['Artist']:"";
						(isset($exif_data['Copyright'])) ? $exif_arr['Copyright'] = $exif_data['Copyright']:"";
						(isset($exif_data['Description'])) ? $exif_arr['Description'] = $exif_data['Description']:"";
						(isset($exif_data['Title'])) ? $exif_arr['Title'] = $exif_data['Title']:"";
						(isset($exif_data['COMPUTED']['Width'])) ? $exif_arr['ExifImageWidth'] = $exif_data['COMPUTED']['Width']:"";
						(isset($exif_data['COMPUTED']['Height'])) ? $exif_arr['ExifImageHeight'] = $exif_data['COMPUTED']['Height']:"";
						*/
					}

					$imgarr[] = array_filter($exif_arr);
				}

				new_keywords($imgarr);

				foreach ($imgarr as $key => $img) {
					setKW($img);
				}
			}
		}
	}
}

function setKW($img) {
	global $db, $bdir;
	$cPath = substr($img['SourceFile'], strlen($bdir));
	$data = [];

	if(isset($img['Keywords'])) {
		$kArr = explode(', ', $img['Keywords']);

		// Resolve the file ID; skip files that are not in the file cache yet.
		$fileid = null;
		$stmt = $db->prepare("SELECT `fileid` FROM `oc_filecache` WHERE `path` = ?");
		if ($stmt->execute(array($cPath))) {
			while ($row = $stmt->fetch()) {
				$fileid = $row['fileid'];
			}
		}
		if ($fileid === null) {
			logm("No file cache entry for ".$cPath, 2);
			return;
		}

		// Look up tag IDs with a prepared statement, so keywords that contain
		// quotes cannot break or inject into the query.
		$stmt = $db->prepare("SELECT `id` FROM `oc_systemtag` WHERE `name` = ?");
		foreach ($kArr as $keyword) {
			$stmt->execute(array($keyword));
			while ($row = $stmt->fetch()) {
				$data[] = [$fileid, 'files', $row['id']];
			}
		}

		// Insert all mappings once, in a single transaction.
		if (count($data) > 0) {
			$query = "INSERT IGNORE INTO `oc_systemtag_object_mapping` (`objectid`, `objecttype`, `systemtagid`) VALUES (?, ?, ?)";
			$stmt = $db->prepare($query);
			try {
				$db->beginTransaction();
				foreach ($data as $row) {
					$stmt->execute($row);
				}
				$db->commit();
			} catch (Exception $e){
				$db->rollback();
				throw $e;
			}
		}
	} else {
		logm("No Keyword in ".$cPath, 3);
	}
}

function logm($message, $mmode = 3) {
	global $logfile;
	$dtime = date("Y-m-d H:i:s");

	switch($mmode) {
		case 1: $msmode = " [ERRO] "; break;
		case 2: $msmode = " [WARN] "; break;
		case 3: $msmode = " [INFO] "; break;
		case 4: $msmode = " [DBUG] "; break;
		default: $msmode = " [INFO] "; break; // unknown levels fall back to INFO
	}

	$line = $dtime.$msmode.$message."\n";
	echo $line;
	file_put_contents($logfile, $line, FILE_APPEND);
}

function new_keywords($arr) {
	global $db;
	$kw = array();
	foreach ($arr as $key => $image) {
		$keywords = isset($image['Keywords']) ? explode(', ', $image['Keywords']):array();
		$kw = array_merge($kw, $keywords);
	}
	$kw = array_unique($kw);

	$tags = array();
	$query = "SELECT `name` FROM `oc_systemtag` ORDER BY `name`;";

	foreach ($db->query($query) as $row) {
		array_push($tags, $row['name']);
	}

	$diff = array_diff($kw, $tags);

	$vals = [];
	if(is_array($diff) && count($diff) > 0) {
		foreach ($diff as $key => $keyword) {
			$vals[] = [mb_convert_encoding($keyword, 'utf8'), uniqid('phtags_', true)];
		}
		
		$stmt = $db->prepare("INSERT IGNORE INTO `oc_systemtag` (`name`, `etag`) VALUES (?, ?)");
		try {
			$db->beginTransaction();
			foreach ($vals as $row) {
				$stmt->execute($row);
			}
			$db->commit();
		} catch (Exception $e){
			$db->rollback();
			throw $e;
		}
	}
}

function parse_fraction($v, $round = 0) {
	list($x, $y) = array_map('intval', explode('/', $v));
	if (empty($x) || empty($y)) {
		return $v;
	}
	if ($x % $y == 0) {
		return $x / $y;
	}
	if ($y % $x == 0) {
		return "1/" . $y / $x;
	}
	return round($x / $y, $round);
}

function shutter($value) {
	$pos = strpos($value, '/');
	$a = (float) substr($value, 0, $pos);
	$b = (float) substr($value, $pos + 1);
	$apex = ($b == 0) ? ($a) : ($a / $b);
	$shutter = pow(2, -$apex);
	if ($shutter == 0) return false;
	if ($shutter >= 1) return round($shutter);
	return '1/'.round(1 / $shutter);
}

function gps($exifCoord, $hemi) {
	$degrees = count($exifCoord) > 0 ? gps2Num($exifCoord[0]) : 0;
	$minutes = count($exifCoord) > 1 ? gps2Num($exifCoord[1]) : 0;
	$seconds = count($exifCoord) > 2 ? gps2Num($exifCoord[2]) : 0;

	$flip = ($hemi == 'W' or $hemi == 'S') ? -1 : 1;
	return $flip * ($degrees + $minutes / 60 + $seconds / 3600);
}

function gps2Num($coordPart) {
	$parts = explode('/', $coordPart);
	if (count($parts) <= 0) return 0;
	if (count($parts) == 1) return $parts[0];

	$f = floatval($parts[0]);
	$s = floatval($parts[1]);

	$e = ($s == 0) ? 0:$f/$s;
	return $e;
}

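function iptc_keywords($iptcdata) {
	// Parse once; iptcparse() returns false if the APP13 block holds no IPTC data.
	$iptc = iptcparse($iptcdata);
	if(is_array($iptc) && isset($iptc['2#025'])) {
		// 2#025 is the IPTC "Keywords" record.
		$keywords = implode(', ', $iptc['2#025']);
	} else {
		$keywords = null;
	}
	return $keywords;
}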
?>

This file should not be called from the browser, but from the command line, for example with sudo -u www-data php /path/to/phtags.php '/ncuser/files/Photos/'. Here www-data is the webserver user, and the second path is the Nextcloud-internal path to someone's photos. The script scans that path for photos with tags and adds these tags to the oc_systemtag table. Every tag gets an etag with a phtags_ prefix in this table, so you can easily revert all these tags. After that, it creates the file-tag mappings in oc_systemtag_object_mapping. The actions are logged in /var/log/systemtags.log. The script also creates a file phtags_last.run, in which it saves the timestamp at which the last run started. On start, it reads this timestamp and compares it with each photo's last-modified time; photos that have not been modified since the last run are treated as already scanned and are ignored.
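
Because every tag created by the script carries an etag starting with phtags_, a revert could look roughly like the sketch below. `revertPhtags` is a hypothetical helper, not part of the script; it only assumes the table and column names used above and a PDO connection in exception mode.

```php
<?php
/**
 * Remove every tag whose etag starts with the given prefix, plus its file mappings.
 * Hypothetical helper; assumes PDO::ERRMODE_EXCEPTION. Returns the number of tags deleted.
 */
function revertPhtags(PDO $db, string $prefix = 'phtags_'): int
{
    // Escape LIKE wildcards in the prefix ("_" would otherwise match any character).
    $pattern = str_replace(['!', '%', '_'], ['!!', '!%', '!_'], $prefix) . '%';

    $db->beginTransaction();
    try {
        // Delete the file-tag mappings first, then the tags themselves.
        $stmt = $db->prepare(
            "DELETE FROM oc_systemtag_object_mapping WHERE systemtagid IN
             (SELECT id FROM oc_systemtag WHERE etag LIKE ? ESCAPE '!')"
        );
        $stmt->execute([$pattern]);

        $stmt = $db->prepare("DELETE FROM oc_systemtag WHERE etag LIKE ? ESCAPE '!'");
        $stmt->execute([$pattern]);
        $deleted = $stmt->rowCount();

        $db->commit();
        return $deleted;
    } catch (Throwable $e) {
        $db->rollBack();
        throw $e;
    }
}
```

Tags that already existed before the first run keep their original etag, so they and their mappings are left untouched by this sketch.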

I just fiddled around with this during my lunch break, so I suspect there are still various logical and conceptual errors in there. At least the first run successfully read the tags for my user and added them to the photos. It's important to note that these are system tags. This means that the tags are not at the user level, but system-wide. A shared photo will therefore also show the same tags for another user. Personally, I would prefer a tagging table exclusively for Memories at the user level. However, the current structure does not allow for this.

BTW: There is more metadata that could be read; that's why the script contains commented-out lines for some other fields. For this variant, I use PHP's built-in exif function, which is relatively slow compared to exiftool on the command line. For a huge directory with more than 1 TB of data, I would recommend using exiftool on the command line instead.
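
For the exiftool route, a minimal sketch could read the keywords from exiftool's JSON output instead of calling PHP's exif functions per file. The function names below are hypothetical, and the sketch assumes exiftool is installed and on the PATH; `-json`, `-r`, `-Keywords` and `-Subject` are regular exiftool options and tag names.

```php
<?php
/**
 * exiftool's JSON output may give a list-type tag as a single value (one item)
 * or as an array (several items); normalise both to a unique list of strings.
 */
function normalizeKeywords($value): array
{
    if ($value === null || $value === '') {
        return [];
    }
    $items = is_array($value) ? $value : [$value];
    $items = array_map(fn ($v) => trim((string) $v), $items);
    return array_values(array_unique(array_filter($items, fn ($v) => $v !== '')));
}

/** Map each SourceFile in exiftool's JSON output to its merged keyword list. */
function parseExiftoolJson(string $json): array
{
    $entries = json_decode($json, true);
    if (!is_array($entries)) {
        return [];
    }
    $result = [];
    foreach ($entries as $entry) {
        $merged = array_merge(
            normalizeKeywords($entry['Keywords'] ?? null),
            normalizeKeywords($entry['Subject'] ?? null)
        );
        $result[$entry['SourceFile']] = array_values(array_unique($merged));
    }
    return $result;
}

/** Run exiftool recursively over $dir (assumes exiftool is on the PATH). */
function readKeywordsWithExiftool(string $dir): array
{
    $cmd = 'exiftool -json -r -Keywords -Subject ' . escapeshellarg($dir);
    $json = shell_exec($cmd);
    return is_string($json) ? parseExiftoolJson($json) : [];
}
```

One exiftool process then handles the whole directory tree, and the resulting per-file keyword lists could be fed into the same tag and mapping inserts the script already performs.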


5 participants