Duplicate Product Lines

This page details our solution to the Define Duplicate Product Lines challenge from dmcommunity.org

Define Duplicate Product Lines

Let’s assume that your organization handles sales orders similar to the one below:

Product SKU

Price

Quantity

SKU-1000

10

10

SKU-1

20

3

SKU-2

30

1

SKU-1

20

4

SKU-3

40

6

SKU-3

42

8

SKU-4

15

2

Create a decision model that is capable to decide which product lines inside such orders are duplicate. The duplicate product lines can be recognized by user-defined similarity rules, e.g. two product lines are considered “duplicate” if they have the same “sku” and “price”.

To solve this challenge, we simply need to check whether two orders have the same properties. To be able to model these properties, we first define them in our glossary. We create a type for every product (one to seven), a type containing the possible SKU’s, and a type representing the quantities.

Type
Name	Type	Values
Product	Int	[1..7]
SKU	String	SKU1000, SKU1, SKU2, SKU3, SKU4
Quantity	Int	[0..100]

Because every product has exactly one SKU, Price and Quantity, we model these properties using functions.

Function
Name	Type
sku of Product	SKU
price of Product	Int
quantity of Product	Quantity

We also need a way to express that two products are duplicates of each other. For this purpose, we introduce a relation.

Relation
Name
Product is duplicate of Product

Now our glossary is complete, and we can start to add logic. For this specific challenge, the logic is simple: if two products share the same SKU and Price, they are duplicates. We express this in a table as follows.

Define duplicates
U	Product called p1	Product called p2	sku of p1	price of p1	p1 is duplicate of p2
1	-	not(p1)	sku of p2	price of p2	Yes

In other words, the table above states “For every Product p1 and p2 (not equal to p1), if they have the same SKU and the same Price, they are duplicates.” Or, “Duplicates are defined as two products that share the same SKU and Price”.

Now, all that remains is add in our data table containing the product information.

Product
D	Product	sku of Product	price of Product	quantity of Product
1	1	SKU1000	10	10
2	2	SKU1	20	3
3	3	SKU2	30	1
4	4	SKU1	20	4
5	5	SKU3	40	6
6	6	SKU3	42	8
7	7	SKU4	15	2

Running the solver using this model gives us the following solution.

Model 1
==========
sku_of_Product := {1->SKU1000, 2->SKU1, 3->SKU2, 4->SKU1, 5->SKU3, 6->SKU3, 7->SKU4}.
price_of_Product := {1->10, 2->20, 3->30, 4->20, 5->40, 6->42, 7->15}.
quantity_of_Product := {1->10, 2->3, 3->1, 4->4, 5->6, 6->8, 7->2}.
Product_is_duplicate_of_Product := {(2,4), (4,2)}.

Elapsed Time:
0.871